scrya-com / rotorquant
RotorQuant: Clifford algebra vector quantization for LLM KV cache compression. 10-19x faster than TurboQuant, 44x fewer parameters.
Implements TurboQuant and an improved RotorQuant method to dramatically compress the memory caches used by large language models during extended text processing, enabling longer contexts with minimal accuracy loss.
How It Works
RotorQuant compresses the key-value (KV) cache that large language models accumulate during long-context inference, so models can handle much longer conversations without exhausting memory.
Install the package and attach it to a compatible LLM; it quantizes KV cache entries transparently as they are written, with no changes to the generation loop.
Run the built-in checks to verify correct behavior on sample data.
Benchmarks report roughly 5x smaller KV cache memory and up to 19x faster quantization than TurboQuant.
The result: book-length contexts on commodity hardware with minimal loss in model accuracy or speed.
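The general rotate-then-quantize idea behind KV cache compression methods of this kind can be sketched in a few lines of NumPy. This is an illustration of the overall technique (apply an orthogonal rotation to spread energy across channels, then quantize to low-bit integers), not the repo's actual API; every function name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(d, rng):
    # QR decomposition of a Gaussian matrix yields a random orthogonal rotation
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def quantize_int4(x):
    # Per-vector symmetric 4-bit quantization: largest |value| maps to 7
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Toy "KV cache": 128 cached key vectors of dimension 64
kv = rng.standard_normal((128, 64)).astype(np.float32)

R = random_rotation(64, rng)
rotated = kv @ R                          # rotate before quantizing
q, scale = quantize_int4(rotated)         # store q (int4 range) + one scale per vector
recovered = dequantize(q, scale) @ R.T    # undo the rotation after dequantizing

err = np.linalg.norm(recovered - kv) / np.linalg.norm(kv)
print(f"relative reconstruction error: {err:.3f}")
```

Storing 4-bit codes plus one scale per vector in place of fp16 values is where the memory savings come from; the orthogonal rotation makes per-vector scaling behave well by flattening outlier channels.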