GPT-2 from scratch

chizkidd · chizkidd · commit c7655d518c36 · 2025-08-03T16:44:12.000-05:00
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 This repository contains my detailed implementations of Andrej Karpathy's [Neural Networks: Zero to Hero YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) lecture series & exercises in Jupyter Notebook. The notebooks go into extreme detail to ensure a proper, robust fundamental understanding of the deep learning concepts being covered. This lecture series covers neural networks and how to build them from scratch in code. The course starts with the basics of backpropagation, then proceeds to multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and finally builds up to modern deep neural networks like generative pre-trained transformers (GPTs). The course also introduces diagnostic tools for understanding neural network dynamics and performance. Finally, this course covers the basics of tokenization, byte-pair encoding (BPE), and how to build a tokenizer from scratch. The course focuses on language modeling (LM) because language models are a superb place to learn deep learning and most of the skills learned here transfer immediately to other fields of deep learning such as computer vision (CV).
 
 ```python
-[NN] ---> [MLP] ---> [CNN] ---> [GPT] ---> [BPE]
+[NN] ---> [MLP] ---> [CNN] ---> [GPT] ---> [BPE] ---> [GPT-2]
 ```
 
 
@@ -17,6 +17,7 @@ This repository contains my detailed implementations of Andrej Karpathy's [Neura
 | Makemore 5 | [Wavenet](006_makemore_WaveNet/makemore_WaveNet.ipynb) | [Wavenet Exercises](006_makemore_WaveNet/WaveNet_Exercises.ipynb)
 | GPT | [GPT](007_GPT/gpt.ipynb) | [1, 2a](007_GPT/ex1-2a.ipynb), [2b](007_GPT/ex2b.ipynb), [3](007_GPT/ex3.ipynb)
 | minBPE | [minBPE](008_minBPE/minbpe.ipynb) | [minBPE Exercises](008_minBPE/minbpe-exercises.ipynb)
+| GPT-2 | [GPT-2](009_GPT-2/gpt-2.ipynb) | ---
 
 Still Ongoing... 
 
@@ -28,6 +29,8 @@ Still Ongoing...
 - [makemore Repository](https://github.com/karpathy/makemore)
 - [GPT Repository](https://github.com/karpathy/ng-video-lecture)
 - [minBPE Repository](https://github.com/karpathy/minBPE)
+- [nanoGPT Repository](https://github.com/karpathy/nanoGPT)
+
 ## Acknowledgments
 
 Thank you Andrej Karpathy for creating and sharing the **"Neural Networks: Zero to Hero"** tutorial lecture series on [YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) & [GitHub](https://github.com/karpathy/nn-zero-to-hero/tree/master). Your dedication and expertise have made this resource possible. Find more of Andrej's work on his [GitHub](https://github.com/karpathy), [Twitter](https://x.com/karpathy), [Website](https://karpathy.ai) and [AI native school, Eureka Labs](https://github.com/EurekaLabsAI).