grigio / opencode-benchmark-dashboard
Benchmark system for testing opencode with various LLM models, measuring speed (latency) and correctness (accuracy).
A local dashboard tool for testing AI language models on custom prompts and comparing their response speed and answer accuracy.
How It Works
Compare different AI assistants on how fast and accurately they solve real tasks:

1. Create a folder of task descriptions with their correct answers, such as coding challenges or short questions.
2. Pick one AI model and let it answer every task, recording how long each response takes.
3. A separate judge model reviews each answer and scores its correctness, giving reliable pass or fail marks.
4. With one click, a local dashboard opens in your browser showing charts, heatmaps, and comparisons.
5. Run the same tasks against other models to see which ones shine in speed and smarts.
6. Pick the model that best balances quick responses with spot-on accuracy, ready to use in your projects.
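The core loop above (time each answer, then score it) can be sketched in a few lines of Python. This is a minimal illustration, not the repo's actual implementation: the task format, `ask_model`, and the exact-match `judge` are hypothetical stand-ins for the real model call and LLM-based judge.

```python
import time

# Hypothetical task list; the real project reads tasks from a folder.
TASKS = [
    {"prompt": "What is 2 + 2?", "expected": "4"},
    {"prompt": "Reverse the string 'abc'.", "expected": "cba"},
]

def ask_model(prompt: str) -> str:
    """Stand-in for calling the model under test."""
    return "4" if "2 + 2" in prompt else "cba"

def judge(answer: str, expected: str) -> bool:
    """Stand-in for the LLM judge; here a simple exact match."""
    return answer.strip() == expected

def run_benchmark(tasks):
    results = []
    for task in tasks:
        start = time.perf_counter()          # wall-clock timer start
        answer = ask_model(task["prompt"])   # model under test answers
        latency = time.perf_counter() - start
        results.append({
            "prompt": task["prompt"],
            "latency_s": latency,            # speed metric
            "passed": judge(answer, task["expected"]),  # correctness metric
        })
    return results

results = run_benchmark(TASKS)
accuracy = sum(r["passed"] for r in results) / len(results)
print(f"accuracy: {accuracy:.0%}")  # prints "accuracy: 100%" for these stubs
```

The per-task `latency_s` and `passed` records are exactly the data a dashboard would aggregate into speed charts and accuracy heatmaps across models.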