nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Overview
A rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
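As a rough sanity check on the "124M" figure, the parameter count of GPT-2 small can be reproduced from its well-known published hyperparameters (12 layers, 768-dim embeddings, 1024-token context, 50,257-token vocabulary). The sketch below is an independent back-of-the-envelope tally, not code from the repository:

```python
# Back-of-the-envelope parameter count for GPT-2 small,
# from the published hyperparameters (not nanoGPT code itself).
n_layer, n_embd, block_size, vocab_size = 12, 768, 1024, 50257

wte = vocab_size * n_embd  # token embedding (weight-tied with the output head)
wpe = block_size * n_embd  # learned position embedding

per_layer = (
    2 * n_embd                            # ln_1 weight + bias
    + n_embd * 3 * n_embd + 3 * n_embd    # c_attn: fused q,k,v projection
    + n_embd * n_embd + n_embd            # attention output projection
    + 2 * n_embd                          # ln_2 weight + bias
    + n_embd * 4 * n_embd + 4 * n_embd    # mlp c_fc (expand 4x)
    + 4 * n_embd * n_embd + n_embd        # mlp c_proj (contract back)
)

total = wte + wpe + n_layer * per_layer + 2 * n_embd  # + final layernorm
print(f"{total/1e6:.2f}M parameters")  # ≈ 124.44M
```

The weight-tied output head adds no extra parameters; nanoGPT itself prints a slightly smaller figure (~123.65M) because its default count excludes the position embeddings.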
Statistics
- ⭐ Stars: 49,315
- 🍴 Forks: 8,256
- 📝 Language: Python
- 📜 License: MIT
Getting Started
Visit the GitHub repository for installation instructions and documentation.
This project information was automatically generated from GitHub. Last updated: 12/9/2024