SwarmOne

Open-source benchmark for LLM inference under agentic swarm workloads

Found Apr 14, 2026 at 29 stars.
AI Summary

An open-source benchmark suite that measures AI language model servers' performance under multi-turn, high-context coding workloads simulating teams of AI agents.

How It Works

1. 🔍 Discover the benchmark

You hear about a free tool that tests how well your AI server handles teams of coding helpers working together at once.

2. 📦 Get it set up

You add the tool to your computer in seconds, like installing a helpful app.

3. 🔗 Point it at your AI

You give the tool your AI server's address so it can send test messages.
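Concretely, setup and pointing might look like this; the package name and the `--endpoint` flag are assumptions inferred from the `asb` command named in the review on this page, not documented usage:

```shell
# Hypothetical install; the package name is assumed from the repo name.
pip install agentic-swarm-bench

# Point it at any OpenAI-compatible endpoint; the flag name is an assumption.
asb speed --endpoint http://localhost:8000/v1
```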

4. Pick your test style

Quick speed test: run a fast check to see basic response times with a few helpers.

📈 Full deep dive: sweep through busy scenarios and get a full performance story.
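In CLI terms, the two styles might map onto something like the following; only `asb speed` and the `--suite full` flag appear on this page, so treat the pairing as illustrative:

```shell
# Quick speed test: basic response times with default settings.
asb speed

# Full deep dive: sweep the busy scenarios (flag taken from the review's example).
asb speed --suite full
```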

5. 🚀 Watch the magic

The tool floods your AI with simulated coding chats that grow like a real project, measuring every pause and the speed of each reply.
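What "measures every pause and speed" boils down to is timing the stream of tokens: time to first token (TTFT) and tokens per second. Here is a minimal, self-contained sketch of that arithmetic, not the tool's actual code:

```python
import time

def stream_metrics(chunks):
    """Measure TTFT and decode throughput from a stream of token chunks.

    `chunks` is any iterable yielding token strings; in a real benchmark it
    would wrap a streaming response from an OpenAI-compatible endpoint.
    """
    start = time.perf_counter()
    ttft = None
    tokens = 0
    for _token in chunks:
        now = time.perf_counter()
        if ttft is None:
            ttft = now - start  # time to first token
        tokens += 1
    elapsed = time.perf_counter() - start
    tok_per_s = tokens / elapsed if elapsed > 0 else 0.0
    return {"ttft_s": ttft, "tokens": tokens, "tok_per_s": tok_per_s}

def fake_stream(n=50, delay=0.001):
    """Stand-in for a model response: yields n tokens with a small delay."""
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"

metrics = stream_metrics(fake_stream())
print(metrics["tokens"])  # 50
```

A real harness would run many such streams concurrently and aggregate the per-request metrics; the arithmetic per stream stays the same.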

6. 📊 See your results

Get easy reports with green lights, charts, and clear advice on what's fast or slow.

Know your AI's power

You now understand if your setup shines for group AI coding adventures or needs tweaks.


Star Growth

This repo grew from 29 to 44 stars.
AI-Generated Review

What is agentic-swarm-bench?

Agentic-swarm-bench is a Python CLI tool that benchmarks LLM inference speed and quality under realistic agentic swarm workloads: multi-turn coding sessions, as in Claude Code or Cursor, with contexts growing from 6K to 400K tokens, tool calls, and file contents. Point it at any OpenAI-compatible endpoint (vLLM, SGLang, TGI, or cloud APIs) and run modes like `asb speed` for TTFT and tokens-per-second metrics, `asb eval` for code correctness, or `asb record`/`asb replay` for capturing and replaying real sessions. As an open-source LLM benchmark, it fills the gap left by SWE-bench (quality only) and LMSys (short chats).
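Going by the subcommands named above, a session might look like this; arguments beyond the bare subcommand names are not documented on this page, so none are shown:

```shell
asb speed    # latency benchmark: TTFT and tokens-per-second
asb eval     # code-correctness evaluation
asb record   # capture a real agentic coding session
asb replay   # replay a captured session against new hardware
```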

Why is it gaining traction?

Unlike generic LLM benchmarks with uniform short prompts, it simulates agentic realities: prefix-cache defeat via unique salts, contexts that grow across coding-session stages, reasoning-token detection, and Docker one-liners for GPU/Linux/PC/Windows testing. Record your Cursor session once, replay it against new hardware for apples-to-apples throughput comparisons, and get Markdown reports with verdicts like "GOOD for agentic swarm." Tools like this hook devs tuning their inference stacks.
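The "prefix cache defeat via unique salts" idea is worth unpacking: inference servers such as vLLM can reuse cached KV state for repeated prompt prefixes, which would flatter benchmark numbers, so each request gets a unique prefix. A sketch of the idea follows; the salting format is illustrative, not the tool's:

```python
import uuid

def salt_prompt(prompt: str) -> str:
    """Prefix the prompt with a unique salt so engines like vLLM cannot
    serve it from a shared KV prefix cache, keeping timings honest."""
    return f"[salt:{uuid.uuid4().hex}] {prompt}"

a = salt_prompt("Refactor utils.py to remove dead code.")
b = salt_prompt("Refactor utils.py to remove dead code.")
assert a != b                                       # no shared prefix
assert a.split("] ", 1)[1] == b.split("] ", 1)[1]   # payload unchanged
```

The salt makes otherwise-identical requests diverge at token one, forcing a full prefill on every run while leaving the actual task text untouched.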

Who should use this?

AI infra engineers deploying vLLM or SGLang on A100/H100 clusters for agentic apps. Teams building AI coding agents that need to validate endpoint latency at 40K+ token contexts with 32 concurrent users. Anyone stress-testing a local inference setup before production.

Verdict

Grab it if agentic workloads are your bottleneck: it installs via pip, is Docker-ready, and ships solid docs under Apache 2.0. At 44 stars it's still early, so expect bugs in edge cases like custom APIs. Run `asb speed --suite full` on your stack today to baseline.


