Educational code companion to the book 'The Physics of LLM Inference', offering runnable examples, benchmarks, and tests organized by chapter to explore transformer mechanics, generation loops, optimizations, batching, and production serving.
How It Works
You start with a collection of hands-on examples that teach the inner workings of AI language models, organized as a workbook for the book.
You follow the easy setup steps to prepare your computer, and soon your playground is open.
You play with pieces like attention and feed-forward networks, watching a simple language model come alive.
You measure how fast different parts run, discovering why some tricks make everything quicker.
You run checks to confirm every example behaves just right.
By the end, you understand the physics of making AI chat fast and efficient, and you are ready to build your own.
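To give a flavor of the attention piece mentioned in the steps above, here is a minimal sketch of single-head scaled dot-product attention in plain Python. It is not taken from the repository: the function names `softmax` and `attention`, the list-of-lists representation, and the absence of masking or batching are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention (illustrative sketch).

    queries: list of vectors, each of length d
    keys:    list of vectors, each of length d
    values:  list of vectors, one per key
    Returns one output vector per query.
    """
    d = len(keys[0])
    scale = math.sqrt(d)
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [1.0, 0.0]]
    v = [[2.0, 0.0], [4.0, 2.0]]
    print(attention(q, k, v))
```

Because the two keys in the usage example are identical, the query weights them equally and the output is the plain average of the two value vectors. A real implementation would typically use a tensor library, a causal mask, and multiple heads.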