rdmsr / sectorllm

Public

The world's smallest llama2 inference engine

18 stars · 1 fork · 100% credibility
Found May 05, 2026 at 18 stars
AI Analysis
Language: Assembly
AI Summary

A minimal boot sector program that runs a tiny quantized Llama2 model to generate text stories directly from disk without an operating system.

How It Works

1. 📰 Discover Tiny AI Storyteller

You hear about a super small AI that tells children's stories right when a computer starts up, before any operating system loads.

2. 📥 Grab the Files

Download the project files and the tiny story model from trusted online sources.

3. 🧠 Prepare the Story Brain

Use simple tools on your computer to shrink the model and pack it so it fits in the tiniest space possible.

4. 💾 Create Boot Disk

Combine everything into a special disk image that can boot directly.

5. ▶️ Launch the Virtual Boot

Start it up in a virtual old-school computer simulator and watch the magic happen.

Enjoy Generated Stories

See the AI generate fun children's stories on its own, proving how tiny and clever it can be.
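Step 3 above, shrinking the model, is typically done by quantizing the weights. Below is a minimal Python sketch of symmetric 8-bit quantization with one shared scale per tensor; the function names and scheme are illustrative assumptions, not the repo's actual tooling.

```python
def quantize_q8(weights):
    """Symmetric 8-bit quantization: int8 values plus one shared float scale.

    Storing one byte per weight instead of four is the kind of shrinking
    step a model needs before it can fit alongside code on a boot disk.
    """
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid zero scale
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_q8(q, scale):
    """Recover approximate floats at inference time."""
    return [v * scale for v in q]

# Quantize a few weights and check the round-trip error stays small.
w = [0.8, -1.27, 0.05, 0.0]
q, s = quantize_q8(w)
print(q)  # small signed integers, one byte each
print(max(abs(a - b) for a, b in zip(dequantize_q8(q, s), w)))
```

One float scale plus one byte per weight replaces four bytes per weight, roughly a 4x size cut before any further packing.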


AI-Generated Review

What is sectorllm?

sectorllm runs a full Llama2 inference engine in just 1356 bytes of x86 real mode assembly, booting straight from disk to load a quantized tiny model and generate text like children's stories before any OS starts. Developers get a bootable image via simple commands—download models, quantize with Python, then run in QEMU—to see transformer forward passes and greedy sampling on a 260K-parameter model with 512-token vocab and context. It's assembly code golf for AI, proving LLMs can squeeze into the world's smallest boot sectors, akin to the world's smallest Rubik's cube or violin.
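The greedy sampling mentioned above can be sketched in a few lines of Python; `toy_logits` below is a hypothetical stand-in for the repo's 260K-parameter forward pass over its 512-token vocabulary, not the actual model.

```python
def greedy_generate(logits_fn, prompt, max_tokens):
    """Greedy decoding: at each step, append the single highest-scoring token.

    No randomness and no sampling tables to store, which is why greedy
    decoding is attractive when every byte of code counts.
    """
    tokens = list(prompt)
    for _ in range(max_tokens):
        logits = logits_fn(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

# Toy stand-in for a forward pass over a 512-token vocab: always favors
# the token after the last one, wrapping around.
def toy_logits(tokens):
    return [1.0 if t == (tokens[-1] + 1) % 512 else 0.0 for t in range(512)]

print(greedy_generate(toy_logits, [510], 4))  # [510, 511, 0, 1, 2]
```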

Why is it gaining traction?

It stands out by cramming complete Llama2 inference—no OS, no runtime—into a single boot sector, beating bloated alternatives like full-stack LLM servers. Devs dig the raw minimalism: quantize once, boot, and watch it generate on real-mode hardware, with fused weights and lookup tables keeping it under 1.5KB. In a sea of Python-heavy GitHub dashboards and monitors, this assembly beast hooks low-level hackers chasing size records.
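Lookup tables are a classic size-golf trade: precompute a function once as data so the runtime code never needs a polynomial approximation. A hedged sketch, assuming (not confirmed from the repo) that something like exp is tabulated for softmax:

```python
import math

# Precompute exp() over a fixed input range once; at inference time a
# table index replaces an approximation routine, trading a few bytes of
# data for a lot of code -- a common trade in size-golfed binaries.
TABLE_SIZE = 256
LO, HI = -8.0, 0.0  # softmax inputs can be shifted so the max logit maps to 0
EXP_TABLE = [math.exp(LO + (HI - LO) * i / (TABLE_SIZE - 1))
             for i in range(TABLE_SIZE)]

def exp_lut(x):
    """Approximate exp(x) by clamping x into [LO, HI] and indexing the table."""
    x = min(max(x, LO), HI)
    i = round((x - LO) / (HI - LO) * (TABLE_SIZE - 1))
    return EXP_TABLE[i]

def softmax_lut(logits):
    m = max(logits)
    e = [exp_lut(x - m) for x in logits]  # shift so inputs fall in [LO, 0]
    s = sum(e)
    return [v / s for v in e]

p = softmax_lut([2.0, 1.0, 0.1])
print([round(v, 2) for v in p])  # approximates the exact softmax of these logits
```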

Who should use this?

Assembly wizards tweaking bootloaders or competing in code-golf contests. Embedded engineers experimenting with tiny AI on bare-metal hardware. Systems programmers curious about pre-OS inference, or retro computing fans running LLMs on ancient x86 without modern bloat.

Verdict

Fun proof-of-concept for extreme minimalism, but at 18 stars it's immature—hardcoded model and prompt, no tests, golfed for size over speed. Fork it if you're an assembly god shrinking binaries further; skip it for production.

