mlx-gpt

This learning project implements a GPT language model using Apple's MLX library, following Andrej Karpathy's Let's build GPT video.

🚀 Getting Started

I tried to stay as close as possible to the original material, so that it's easy to follow.
I recommend watching the walkthrough if you haven't yet!

Installation

# Setup the environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

🤖 Usage

Train and run the Bigram model

At the moment the command below trains and runs the model straight away.
It will also download and cache the data if needed.

python bigram.py
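If you are curious what a bigram language model does conceptually, here is a minimal count-based sketch in plain Python. Note this is only an illustration of the idea: the repo's `bigram.py` trains a neural lookup-table version with MLX, as in the video, and the toy corpus below is a stand-in for the Shakespeare data the script downloads.

```python
import random
from collections import defaultdict

# Toy corpus; bigram.py downloads the tiny Shakespeare dataset instead.
text = "hello world"

# Count character-to-character transitions: the essence of a bigram model.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def sample_next(ch, rng=random):
    """Sample the next character proportionally to observed bigram counts."""
    followers = counts[ch]
    chars, weights = zip(*followers.items())
    return rng.choices(chars, weights=weights)[0]

# Generate a few characters starting from 'h'.
out = "h"
for _ in range(5):
    out += sample_next(out[-1])
print(out)
```

The neural version learns the same next-character distribution, but as trainable logits in an embedding table rather than raw counts.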

Validation

As a rough validation, I compared the loss values reported in the video against the results from this implementation.

(Side-by-side screenshots: training loss from the video vs. this MLX implementation.)

Both converge to a similar loss value (please ignore the formatting issues).

Train and run the GPT model

Coming soon...

Other

You can inspect the experimental notebook I created while following the video at experiment.ipynb. It is easier to understand if you follow along with the video.
Tested on a MacBook Air M1.

📦 Dependencies

All dependencies are listed in requirements.txt.

📜 License

MIT License
