Native MTP Speculative Decoding On Apple Silicon | 2x - 2.5x decode TPS increase at temp 0.6 | MLX-native, OpenAI API/Anthropic-compatible serving, no external drafter.
MTPLX speeds up AI language model responses on Apple Silicon Macs by using the model's built-in drafters for speculative decoding without extra memory use.
How It Works
You hear about MTPLX, a simple way to make powerful AI conversations zoom along on Apple Silicon without needing extra hardware.
Run a quick command from your Mac terminal to download and set everything up automatically.
The friendly setup wizard grabs a speedy AI model and opens a chat window or terminal ready to go.
Type your questions and watch the AI reply super fast, with live speed stats and easy controls.
Enjoy a full chat interface with buttons, formatting, and settings that save automatically.
Get quick text-based replies right in your command line for speedy back-and-forth.
Link it to apps like Open WebUI or your code editor for AI help anywhere.
You now have the quickest local AI chats on your Mac, saving time on every conversation.
Star Growth
Repurpose is a Pro feature
Generate ready-to-use prompts for X threads, LinkedIn posts, blog posts, YouTube scripts, and more -- with full repo context baked in.
Unlock RepurposeSimilar repos coming soon.