Featured · advanced · active

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Author: karpathy
Stars: 49,315
Language: Python
Updated: December 9, 2024

Overview

The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
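The training loop in train.py follows the standard pattern: sample a batch, run a forward pass, compute the loss, backpropagate, and step the optimizer. As a rough, self-contained sketch of that loop's shape (a toy one-parameter model stands in for the GPT; none of this is nanoGPT's actual code):

```python
import random

# Toy stand-in for a model: one weight w, loss = mean((w*x - 2*x)^2) over a
# batch, so the optimum is w = 2. Illustrative only -- the real train.py
# optimizes a full GPT with AdamW, gradient accumulation, and checkpointing.

def get_batch(batch_size=8):
    # In nanoGPT this would slice token blocks out of a memory-mapped dataset.
    return [random.uniform(-1.0, 1.0) for _ in range(batch_size)]

def loss_and_grad(w, batch):
    # Mean squared error and its analytic gradient w.r.t. w.
    loss = sum((w * x - 2.0 * x) ** 2 for x in batch) / len(batch)
    grad = sum(2.0 * (w * x - 2.0 * x) * x for x in batch) / len(batch)
    return loss, grad

w = 0.0    # "model parameters"
lr = 0.5   # learning rate
for step in range(200):
    batch = get_batch()
    loss, grad = loss_and_grad(w, batch)
    w -= lr * grad  # optimizer step (plain SGD here, AdamW in nanoGPT)
```

In the real loop the gradient comes from `loss.backward()` and the step from `torch.optim.AdamW`; the structure is the same handful of lines repeated until the loss stops improving.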

Key Features

  • Reproduces GPT-2 (124M) on OpenWebText on a single 8XA100 40GB node in about 4 days
  • Plain, readable code: train.py (~300-line training loop) and model.py (~300-line GPT definition)
  • Can optionally load the GPT-2 weights from OpenAI

Statistics

  • ⭐ Stars: 49,315
  • 🍴 Forks: 8,256
  • 📝 Language: Python
  • 📜 License: MIT

Getting Started

Visit the GitHub repository for installation instructions and documentation.


This project information was automatically generated from GitHub. Last updated: 12/9/2024

Related Projects

intermediate · active · ⭐ 412

MobiAgent

The Intelligent GUI Agent for Mobile Phones

By IPADS-SAI
Python · Apache-2.0
Featured · intermediate · active · ⭐ 1,428

reader3

Quick illustration of how one can easily read books together with LLMs. It's great and I highly recommend it.

By karpathy
Python
intermediate · active · ⭐ 162

PairTranslate

A browser extension for side-by-side translation of web pages

By Cookee24
TypeScript · GPL-3.0