High-Performance KV Cache Storage Engine on CXL Shared Memory for LLM Inference
Maru provides a high-performance key-value storage system using CXL shared memory to enable low-latency KV cache sharing for multiple AI model instances.
How It Works
Maru accelerates LLM inference by letting multiple model instances share KV cache entries over CXL shared memory instead of recomputing them.
You start with a modern Ubuntu machine equipped with CXL shared-memory hardware.
You run the provided install script to set up all dependencies in one step.
You launch the background daemon, which creates and manages a large shared memory pool.
You add a few lines of client code to connect your inference process to that pool.
Each instance writes its KV cache (the attention keys and values computed during prefill) into the shared pool, and other instances read those entries with low latency.
Instances thereby reuse each other's cached computation, cutting redundant prefill work and response latency.
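The write-once, read-by-others flow above can be sketched in miniature with Python's standard shared-memory API. This is not Maru's actual client API: the `SharedKVStore` class and its methods are hypothetical names invented for illustration, and the segment is ordinary RAM-backed shared memory rather than a CXL device. It only shows the core idea that one process serializes KV entries into a named shared segment and another attaches by name and reads them back without a network hop.

```python
# Illustrative sketch only: the class/method names are hypothetical stand-ins
# for a Maru-style client, and RAM-backed POSIX shared memory stands in for CXL.
import pickle
from multiprocessing import shared_memory

class SharedKVStore:
    """Toy KV-cache store over a single named shared-memory segment."""

    HEADER = 8  # bytes reserved at the front for the payload length

    def __init__(self, name: str, size: int = 1 << 20, create: bool = False):
        # create=True plays the daemon's role (allocates the pool);
        # create=False attaches an existing segment by name.
        self.shm = shared_memory.SharedMemory(name=name, create=create, size=size)

    def put(self, kv_entries: dict) -> None:
        # Serialize the KV entries and write them length-prefixed into the segment.
        payload = pickle.dumps(kv_entries)
        self.shm.buf[: self.HEADER] = len(payload).to_bytes(self.HEADER, "little")
        self.shm.buf[self.HEADER : self.HEADER + len(payload)] = payload

    def get(self) -> dict:
        # Read the length prefix, then deserialize the payload.
        n = int.from_bytes(self.shm.buf[: self.HEADER], "little")
        return pickle.loads(bytes(self.shm.buf[self.HEADER : self.HEADER + n]))

    def close(self, unlink: bool = False) -> None:
        self.shm.close()
        if unlink:
            self.shm.unlink()

# "Daemon" side creates the segment; model instances attach to it by name.
writer = SharedKVStore("maru_demo", create=True)
writer.put({"prompt_hash": "abc123", "keys": [0.1, 0.2], "values": [0.3, 0.4]})

reader = SharedKVStore("maru_demo")
entry = reader.get()
print(entry["prompt_hash"])  # the second instance sees the first one's cached entry

reader.close()
writer.close(unlink=True)
```

A real CXL deployment differs mainly in where the segment lives (a memory device shared across hosts, not one machine's DRAM) and in needing concurrency control around reads and writes, which this single-slot sketch omits.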