t3dotgg

Showing azure foundry's awful perf on openai models

Found May 01, 2026 at 19 stars
AI Summary (TypeScript)

A tool that runs repeated tests against AI services to measure response speed, token throughput, and cost, displaying the results in an interactive web dashboard that compares Azure-hosted OpenAI models against the direct OpenAI API.

How It Works

1
📰 Discover the benchmark

You hear claims that one AI hosting service is slower than another and find this tool to check the facts yourself.

2
📊 View live charts

Open the webpage to see colorful graphs tracking speed and cost over time for different services.

3
🚀 Launch speed tests

Start running tests by connecting your AI services, and watch it measure responses to real questions.

4
⏱️ Tests run automatically

The tool sends sample questions repeatedly, timing how quickly answers stream back and noting any hiccups.

5
💾 Results saved and updated

All measurements are stored, and the charts refresh to show the new data alongside past runs.

6
🎉 Understand performance clearly

Now you have easy-to-read comparisons showing which service is faster and cheaper, helping you choose wisely.
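
The timing step above boils down to two numbers per run: time to first token and tokens per second. Here is a minimal TypeScript sketch of that calculation, with hypothetical type and function names; the repo's actual code may differ:

```typescript
// Hypothetical metrics helper (not azure-bench's actual code).
// Given timestamps (ms) recorded around one streamed completion,
// derive time-to-first-token and token throughput.
interface RunTimings {
  startedAt: number;       // request sent
  firstTokenAt: number;    // first streamed chunk arrived
  finishedAt: number;      // stream closed
  completionTokens: number;
}

interface RunMetrics {
  ttftMs: number;          // time to first token, in milliseconds
  tokensPerSec: number;    // throughput over the streaming window
}

function computeMetrics(t: RunTimings): RunMetrics {
  const ttftMs = t.firstTokenAt - t.startedAt;
  const streamMs = t.finishedAt - t.firstTokenAt;
  // Guard against zero-length streams (single-chunk responses).
  const tokensPerSec =
    streamMs > 0 ? (t.completionTokens / streamMs) * 1000 : t.completionTokens;
  return { ttftMs, tokensPerSec };
}
```

For example, a run that sends its first chunk 450 ms after the request and streams 100 tokens over the next 2 seconds yields a TTFT of 450 ms at 50 tokens/sec.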


AI-Generated Review

What is azure-bench?

Azure-bench is a TypeScript benchmarking tool that runs automated tests on Azure-hosted OpenAI models versus direct OpenAI API calls, measuring real-world metrics like tokens per second, time to first token, and cost. It tackles the problem of unpredictable Azure inference latency (reportedly unusably slow roughly 20% of the time) by generating live performance data via a cron-scheduled server. Users get a React dashboard at azure.t3.gg with interactive charts, raw JSON endpoints, and a debug table of all runs.
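
Per-run cost is typically derived from token counts and per-million-token prices. A hedged sketch of that arithmetic, using placeholder prices rather than the repo's real rate table:

```typescript
// Sketch of per-run cost estimation. Prices below are placeholders,
// not actual OpenAI or Azure rates.
interface Pricing {
  inputPerMTok: number;   // USD per 1M prompt tokens
  outputPerMTok: number;  // USD per 1M completion tokens
}

function estimateCostUSD(
  promptTokens: number,
  completionTokens: number,
  p: Pricing,
): number {
  return (
    (promptTokens / 1_000_000) * p.inputPerMTok +
    (completionTokens / 1_000_000) * p.outputPerMTok
  );
}
```

With a hypothetical $2.50/$10 per-million price pair, a 1,000-prompt-token, 500-completion-token run comes out to $0.0075, which is the kind of per-run figure the dashboard can aggregate over time.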

Why is it gaining traction?

It stands out with blunt, public proof of Azure's performance gaps: P90 latency charts and head-to-head OpenAI comparisons, no fluff. Devs are drawn in by the one-command setup (set API keys, run `bun run`) and the optional Postgres persistence for history. The live-updating dashboard exposes failure rates and costs that sales pitches ignore.
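
A P90 latency figure like the ones charted can be computed with a simple nearest-rank percentile over the recorded samples. This is one plausible implementation, not necessarily the repo's:

```typescript
// Nearest-rank percentile: P90 is the value at or below which
// 90% of the sorted samples fall.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank method
  return sorted[Math.max(rank - 1, 0)];
}
```

For ten latency samples of 1 through 10, `percentile(samples, 90)` picks the 9th sorted value, so one slow outlier out of ten is exactly what P90 surfaces while a plain average would dilute it.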

Who should use this?

AI engineers benchmarking Azure OpenAI for production apps, especially latency-sensitive chatbots or RAG pipelines. Teams evaluating Azure benchmarking tools before committing inference workloads to Azure. Ops folks debugging slow endpoints or comparing providers.

Verdict

Grab it if you're on the Azure OpenAI fence: it's a quick reality check with solid metrics. At 19 stars and 1.0% credibility, it's early (README-only docs, no tests), so fork and harden before production use.
