scoootscooob / clawbench
Public
Rigorous benchmark for AI models as OpenClaw agents. Runs on HF Spaces.
ClawBench is a benchmark for testing AI agent setups on practical software tasks using reliable checks and helpful diagnostics.
How It Works
You hear about ClawBench, a way to test how well different AI agents handle everyday computer tasks like fixing code or organizing files.
Download the simple tool and start it up on your computer – it takes just a minute.
Choose your favorite AI model and add any special tools or setups you want it to use.
Hit go, and watch your AI tackle a series of real-world challenges like debugging apps or summarizing data.
Get a clear report showing how well it did, with scores for accuracy, smart steps, and reliability.
Discover exactly what worked great and simple changes to make your AI even better next time.
Now you know your setup's strengths and how to level it up for tougher jobs.
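The steps above can be sketched as a small evaluation loop. This is only an illustrative sketch, not ClawBench's actual code or API: the names `Task`, `RunResult`, `evaluate`, and `upper_agent` are hypothetical, and the three scores are assumed interpretations of "accuracy, smart steps, and reliability" (pass rate, steps used relative to a budget, and consistency across repeated runs).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    output: str  # what the agent produced
    steps: int   # how many steps it took (hypothetical metric)


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # deterministic pass/fail check on the output
    step_budget: int              # assumed "reasonable" number of steps


Agent = Callable[[str], RunResult]


def evaluate(agent: Agent, tasks: list[Task], repeats: int = 3) -> dict[str, float]:
    """Run every task `repeats` times and average three per-task scores."""
    accuracy, efficiency, reliability = [], [], []
    for task in tasks:
        runs = [agent(task.prompt) for _ in range(repeats)]
        passed = sum(1 for r in runs if task.check(r.output))
        # Accuracy: fraction of runs that pass the check.
        accuracy.append(passed / repeats)
        # Efficiency: 1.0 when within budget, lower when the agent overshoots.
        efficiency.append(
            sum(min(1.0, task.step_budget / max(r.steps, 1)) for r in runs) / repeats
        )
        # Reliability: share of runs agreeing with the majority outcome.
        reliability.append(max(passed, repeats - passed) / repeats)
    n = len(tasks)
    return {
        "accuracy": sum(accuracy) / n,
        "efficiency": sum(efficiency) / n,
        "reliability": sum(reliability) / n,
    }


def upper_agent(prompt: str) -> RunResult:
    """Toy stand-in for a real agent: upper-cases the prompt in two steps."""
    return RunResult(output=prompt.upper(), steps=2)


if __name__ == "__main__":
    tasks = [
        Task("shout", "hello", lambda out: out == "HELLO", step_budget=4),
        Task("impossible", "bye", lambda out: out == "nope", step_budget=1),
    ]
    print(evaluate(upper_agent, tasks))
```

A real setup would replace `upper_agent` with a call into the chosen model and tools, and the per-task breakdown (rather than only the averages) is what would show which tasks worked and which need changes.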