How-to-Train-Your-GPT
Build a LLaMA-style model from scratch with no ML or math prerequisites.
Build a modern LLM from scratch. Every line commented. Explained like we are five.
Explains attention mechanisms to five-year-olds while building LLaMA 3 from scratch.
Python developers wanting to understand Transformer internals without a PhD
Andrej Karpathy's nanoGPT · The 'Build a GPT' tutorial series
Karpathy's microGPT in the browser with live loss curves, but pedagogical only, with no production value.
Clever n-gram TF-IDF detection of LLM paraphrases catches smart evasion; it solves a real HN problem, but the use case is narrow.
Gamified AI education beats textbooks, but concept-driven learning exists elsewhere.
Three.js renders real GPT-2 attention patterns you can actually explore interactively.
Interactive LLM explainer covering tokenization through KV cache across 15 chapters.