Chleba / ollamaMQ

High-performance Ollama proxy with per-user fair-share queuing, round-robin scheduling, and a real-time TUI dashboard. Built in Rust.

AI Summary

ollamaMQ is a smart manager that queues requests from multiple users and distributes them fairly to a shared AI service, with a live dashboard for monitoring.

How It Works

1. 🔍 Hear about ollamaMQ

You learn about this helpful tool when you want to share your family AI helper with friends without anyone hogging it and causing slowdowns.

2. 📥 Get the tool

You grab the tool with a simple download and setup on your computer, like installing any helpful app.

3. 🚀 Start it up

You launch the tool and tell it where your AI helper is waiting, so it can watch over everything.

4. 📊 Watch the live dashboard

A colorful screen pops up showing who's waiting and how things are flowing, like a friendly traffic controller.

5. 💬 Friends send messages

Your friends chat with the AI by including their name, and the tool lines them up fairly so everyone gets a turn (see the client sketch after this list).

6. 😊 Enjoy smooth sharing

Everyone chats happily without long waits or crashes, and you see the magic balancing act in real time.
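
To make step 5 concrete, here's a minimal client sketch in Rust using the reqwest crate. The /v1/chat/completions endpoint and X-User-ID header are described in the review below; the listen port (8080) and model name (llama3) are placeholders for illustration, not ollamaMQ defaults.

```rust
// Hypothetical client call through the proxy. Assumes Cargo deps:
// reqwest = { version = "0.12", features = ["blocking", "json"] } and serde_json.
use reqwest::blocking::Client;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let resp = Client::new()
        .post("http://localhost:8080/v1/chat/completions") // port is a placeholder
        .header("X-User-ID", "alice") // "their name": how the proxy tells users apart
        .json(&serde_json::json!({
            "model": "llama3", // placeholder model name
            "messages": [{ "role": "user", "content": "Hello!" }]
        }))
        .send()?;
    println!("{}", resp.text()?);
    Ok(())
}
```

Each friend sends their own X-User-ID value, and the proxy rotates between those per-user queues.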

AI-Generated Review

What is ollamaMQ?

ollamaMQ is a Rust-built high-performance proxy for Ollama that queues incoming API requests per user via the X-User-ID header, then dispatches them with fair-share round-robin scheduling to avoid GPU hogging in multi-user setups. It supports full streaming responses, OpenAI-compatible endpoints like /v1/chat/completions, and Ollama-native endpoints like /api/chat. A real-time TUI dashboard lets you monitor queue depths and throughput and block abusive IPs or users on the fly.
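
To make the fair-share round-robin idea concrete, here's a minimal sketch of the general technique: one FIFO per user, rotated so each user with pending work gets one dispatch per cycle. The FairQueue type and String payloads are invented for the example; this is not ollamaMQ's actual code.

```rust
use std::collections::{HashMap, VecDeque};

// Sketch of per-user fair-share round-robin dispatch keyed by X-User-ID.
struct FairQueue {
    queues: HashMap<String, VecDeque<String>>, // user -> pending requests (FIFO)
    order: VecDeque<String>,                   // rotation of users with pending work
}

impl FairQueue {
    fn new() -> Self {
        Self { queues: HashMap::new(), order: VecDeque::new() }
    }

    // Enqueue a request under the sender's private queue.
    fn push(&mut self, user: &str, request: String) {
        let q = self.queues.entry(user.to_string()).or_default();
        if q.is_empty() {
            self.order.push_back(user.to_string()); // user rejoins the rotation
        }
        q.push_back(request);
    }

    // Pop the next request, rotating users so nobody monopolizes the GPU.
    fn pop(&mut self) -> Option<(String, String)> {
        let user = self.order.pop_front()?;
        let q = self.queues.get_mut(&user)?;
        let request = q.pop_front()?;
        if !q.is_empty() {
            self.order.push_back(user.clone()); // still has work: back of the line
        }
        Some((user, request))
    }
}

fn main() {
    let mut fq = FairQueue::new();
    fq.push("alice", "prompt A1".into());
    fq.push("alice", "prompt A2".into());
    fq.push("bob", "prompt B1".into());
    // Interleaved dispatch: A1, B1, A2 -- bob is not stuck behind alice's backlog.
    while let Some((user, req)) = fq.pop() {
        println!("{user}: {req}");
    }
}
```

The point of the rotation: a user who queues 100 prompts still gets only one slot per cycle, so a user with two prompts never waits behind all 100.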

Why is it gaining traction?

In a sea of basic Ollama wrappers, ollamaMQ stands out with per-user queuing and fair-share logic that actually enforces equity without custom scripting, plus seamless Docker Compose deployment and stress-test scripts for load validation. Devs dig the TUI for live insight into queue depths and drops, and OpenAI compatibility means drop-in use with tools like LangChain. It's a high-performance Rust gem for backend queuing without the bloat.

Who should use this?

Multi-user AI teams running shared Ollama instances, like dev squads testing LLMs or internal chatbots where one power user shouldn't starve others. Ideal for ops folks spinning up demo servers or high-performance backend proxies for GPU-limited environments. Skip it if you're solo or need enterprise-scale auth.

Verdict

Grab it via cargo install ollamaMQ for quick multi-tenant wins; docs and Docker setup are polished. But at 64 stars it's still early alpha, so test under load before trusting it in prod. Solid for prototypes; watch for community hardening.


