Aiziyou918 / radixInfer
radixInfer is a layered LLM serving system with a runnable end-to-end control plane. It separates the API, transport, runtime scheduling, cache management, and engine execution layers for clarity, extensibility, and performance experimentation.
radixInfer is a high-performance serving system for running large language models locally, with efficient caching and scheduling, and support for models such as Llama, Mistral, and Qwen.
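The layered separation described above can be sketched in miniature. This is a hypothetical illustration only: the class names (`APIServer`, `Scheduler`, `CacheManager`, `Engine`) and their methods are assumptions for the sketch, not radixInfer's actual interfaces.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    """A single generation request handed down from the API layer."""
    prompt: str

class CacheManager:
    """Illustrative cache layer: memoizes engine output by prompt."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def put(self, key, value):
        self._store[key] = value

class Engine:
    """Stand-in for the execution engine; a real one runs the model."""
    def execute(self, request: Request) -> str:
        return f"generated({request.prompt})"

class Scheduler:
    """FIFO runtime scheduler that consults the cache before the engine."""
    def __init__(self, cache: CacheManager, engine: Engine):
        self.cache, self.engine = cache, engine
        self.queue = deque()
    def submit(self, request: Request) -> str:
        self.queue.append(request)
        req = self.queue.popleft()
        cached = self.cache.get(req.prompt)
        if cached is not None:
            return cached          # cache hit: skip engine execution
        out = self.engine.execute(req)
        self.cache.put(req.prompt, out)
        return out

class APIServer:
    """API layer: turns user input into scheduler requests."""
    def __init__(self, scheduler: Scheduler):
        self.scheduler = scheduler
    def handle(self, prompt: str) -> str:
        return self.scheduler.submit(Request(prompt))
```

The point of the sketch is the dependency direction: the API layer only knows the scheduler, the scheduler mediates between cache and engine, and each layer can be swapped independently for experimentation.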
How It Works
1. Install radixInfer with a single command; no complex setup is required.
2. Choose a model to serve, such as a lightweight Qwen variant.
3. Start the server to bring up a local inference endpoint.
4. Send prompts through the interactive chat and receive replies streamed back token by token.
5. The runtime's caching and scheduling keep response latency low.
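The streaming chat step above can be illustrated with a minimal sketch. radixInfer's actual client API is not documented here, so this simulates streamed token delivery with a plain generator; `stream_reply` and `chat_once` are hypothetical names, and the reply tokens are faked for the example.

```python
import time

def stream_reply(tokens, delay=0.0):
    """Yield reply tokens one at a time, as a streaming server would."""
    for tok in tokens:
        time.sleep(delay)  # stand-in for generation/network latency
        yield tok

def chat_once(prompt: str) -> str:
    # A real client would send the prompt to the local server and read
    # the streamed response; here the tokens are faked for illustration.
    fake_tokens = ["Hello", " from", " your", " local", " model."]
    pieces = []
    for tok in stream_reply(fake_tokens):
        pieces.append(tok)  # an interactive UI prints each token as it arrives
    return "".join(pieces)
```

Streaming matters for interactivity: the user starts reading the first tokens while the rest of the reply is still being generated.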