sharpner / turboquant-mlx
A proof of concept of Google's TurboQuant paper: https://arxiv.org/abs/2504.19874
This project squeezes the memory footprint of AI language models on Apple Silicon to enable faster generation and longer contexts with minimal quality loss.
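To make "squeezing" concrete, here is a minimal sketch of the general idea the linked TurboQuant paper builds on: randomly rotate a vector, then quantize each coordinate to a few bits. It is an illustration under assumptions, not this repository's code. A randomized Hadamard transform stands in for the random rotation, a plain uniform quantizer stands in for the paper's optimized scalar codebooks, and every function name below is hypothetical.

```python
import math
import random

def fwht(x):
    """Orthonormal Walsh-Hadamard transform (its own inverse).
    len(x) must be a power of two."""
    x = list(x)
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x[j], x[j + h] = x[j] + x[j + h], x[j] - x[j + h]
        h *= 2
    scale = 1 / math.sqrt(n)
    return [v * scale for v in x]

def rotate(x, signs):
    # Random sign flips followed by the Hadamard transform: an orthogonal map
    # that spreads any outlier coordinate across the whole vector.
    return fwht([s * v for s, v in zip(signs, x)])

def unrotate(y, signs):
    # Inverse of rotate: undo the transform, then undo the sign flips.
    return [s * v for s, v in zip(signs, fwht(y))]

def quantize(y, bits):
    """Uniform scalar quantizer (an assumption, simpler than the paper's).
    Returns one integer code per coordinate plus the offset and step size."""
    lo, hi = min(y), max(y)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels or 1.0  # avoid a zero step for constant input
    codes = [round((v - lo) / step) for v in y]
    return codes, lo, step

def dequantize(codes, lo, step):
    return [lo + c * step for c in codes]

def roundtrip_error(x, signs, bits):
    """Euclidean distance between x and its quantized reconstruction."""
    codes, lo, step = quantize(rotate(x, signs), bits)
    x_hat = unrotate(dequantize(codes, lo, step), signs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))

rng = random.Random(0)
dim = 128  # hypothetical vector size, must be a power of two here
signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
x[3] = 25.0  # one outlier coordinate

for bits in (2, 4, 8):
    print(bits, "bits -> reconstruction error", roundtrip_error(x, signs, bits))
```

Fewer bits per coordinate means less memory and a larger reconstruction error, which is the trade-off between a faster, smaller setting and a higher-quality one.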
How It Works
You hear about a clever way to make AI conversations on your Mac run faster and use less memory, even in very long chats.
You install the free tools it needs, a quick download for Apple Silicon Macs.
You choose a smart AI model like a chatty assistant that's already tuned for your Mac.
Pick the fast path for quick replies in long chats.
Choose top quality for the sharpest, most accurate responses.
You type a question and watch the AI respond right away.
Responses come back quickly and use far less memory, even after thousands of words.
Now you can keep long AI conversations going smoothly as they grow.
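The memory claim in the steps above can be checked with simple arithmetic. The sketch below assumes the savings come from storing the model's key/value cache (the per-token memory that grows with conversation length) at fewer bits per value. The model shape is hypothetical and chosen only to give round numbers, and the calculation ignores the small overhead of per-vector scales that a real quantized cache also stores.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bits_per_value):
    """Bytes needed to hold keys and values for `tokens` positions."""
    values = 2 * layers * kv_heads * head_dim * tokens  # 2 = keys + values
    return values * bits_per_value / 8

# Hypothetical model shape, for illustration only.
shape = dict(layers=32, kv_heads=8, head_dim=128)
tokens = 32_768

fp16 = kv_cache_bytes(tokens, bits_per_value=16, **shape)
q4 = kv_cache_bytes(tokens, bits_per_value=4, **shape)

print(f"16-bit cache: {fp16 / 2**30:.2f} GiB")  # 16-bit cache: 4.00 GiB
print(f" 4-bit cache: {q4 / 2**30:.2f} GiB")    #  4-bit cache: 1.00 GiB
```

Under these assumptions, going from 16 bits to 4 bits per value cuts the cache to a quarter of its size, so the same amount of memory holds a conversation four times as long.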