nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Overview
A rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
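As a rough sanity check on the "124M" figure, the parameter count of GPT-2 small can be reproduced from its well-known published hyperparameters (12 layers, 768-dim embeddings, 1024-token context, 50,257-token vocabulary). The sketch below is an independent back-of-the-envelope tally, not code from the repository:

```python
# Back-of-the-envelope parameter count for GPT-2 small,
# from the published hyperparameters (not nanoGPT code itself).
n_layer, n_embd, block_size, vocab_size = 12, 768, 1024, 50257

wte = vocab_size * n_embd  # token embedding (weight-tied with the output head)
wpe = block_size * n_embd  # learned position embedding

per_layer = (
    2 * n_embd                            # ln_1 weight + bias
    + n_embd * 3 * n_embd + 3 * n_embd    # c_attn: fused q,k,v projection
    + n_embd * n_embd + n_embd            # attention output projection
    + 2 * n_embd                          # ln_2 weight + bias
    + n_embd * 4 * n_embd + 4 * n_embd    # mlp c_fc (expand 4x)
    + 4 * n_embd * n_embd + n_embd        # mlp c_proj (contract back)
)

total = wte + wpe + n_layer * per_layer + 2 * n_embd  # + final layernorm
print(f"{total/1e6:.2f}M parameters")  # ≈ 124.44M
```

The weight-tied output head adds no extra parameters; nanoGPT itself prints a slightly smaller figure (~123.65M) because its default count excludes the position embeddings.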
Statistics
- ⭐ Stars: 49,315
- 🍴 Forks: 8,256
- 📝 Language: Python
- 📜 License: MIT
Getting Started
Visit the GitHub repository for installation instructions and documentation.
This project information was automatically generated from GitHub. Last updated: 12/9/2024