akitaonrails / llm-coding-benchmark
Simple benchmark to test the most popular open source and commercial LLMs with automated OpenCode
This project benchmarks various AI language models on their ability to autonomously generate a complete Ruby on Rails web application from a fixed prompt, including validation of runtime functionality.
How It Works
This tool lets you compare how well different AI models can build a complete web application on their own.
You warm up your local models first so they can handle large tasks without stalling.
You choose which models from the list to test, mixing local models that run on your machine with cloud-hosted ones.
With a single command, you start the benchmark, in which each model tries to build the complete application step by step.
The models generate code and files, and the benchmark checks whether the result actually runs.
Automatic reports show run times, file counts, success rates, and which model built the best application.
You end up knowing which model performed best, with working applications to explore and share.
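The workflow above can be pictured as a small loop: run each model, time it, count the files it produced, check whether the result works, and print a ranked report. The following is only a minimal sketch of that idea and not the project's actual code. Every name in it (`RunResult`, `run_benchmark`, `format_report`, `fake_generate`, `has_gemfile`) is hypothetical, and the generation and validation steps are stand-in functions rather than real calls to OpenCode or Rails.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class RunResult:
    model: str        # model identifier, as chosen by the user
    seconds: float    # wall-clock time for generation plus validation
    file_count: int   # number of files found in the output directory
    succeeded: bool   # True if the validation check passed


def run_benchmark(
    models: List[str],
    generate: Callable[[str, Path], None],
    validate: Callable[[Path], bool],
    root: str,
) -> List[RunResult]:
    """Run every model once and collect timing, file count, and success."""
    results = []
    for model in models:
        out_dir = Path(root) / model.replace("/", "_")
        out_dir.mkdir(parents=True, exist_ok=True)
        start = time.perf_counter()
        try:
            generate(model, out_dir)       # hypothetical: ask the model to build the app
            ok = bool(validate(out_dir))   # hypothetical: check that the app works
        except Exception:
            ok = False                     # a crashed run counts as a failure
        elapsed = time.perf_counter() - start
        files = sum(1 for p in out_dir.rglob("*") if p.is_file())
        results.append(RunResult(model, elapsed, files, ok))
    return results


def format_report(results: List[RunResult]) -> str:
    """Render a plain-text table: successful runs first, then fastest first."""
    ordered = sorted(results, key=lambda r: (not r.succeeded, r.seconds))
    lines = [f"{'model':<20} {'ok':<5} {'files':>5} {'seconds':>8}"]
    for r in ordered:
        lines.append(
            f"{r.model:<20} {str(r.succeeded):<5} {r.file_count:>5} {r.seconds:>8.2f}"
        )
    return "\n".join(lines)


def fake_generate(model: str, out_dir: Path) -> None:
    """Stand-in for a real agent run; writes a couple of placeholder files."""
    if model == "model-b":
        (out_dir / "README.md").write_text("incomplete run\n")
        return
    (out_dir / "Gemfile").write_text("source 'https://rubygems.org'\n")
    (out_dir / "app.rb").write_text("puts 'hello'\n")


def has_gemfile(out_dir: Path) -> bool:
    """Stand-in validation; a real check would boot the app and probe it."""
    return (out_dir / "Gemfile").is_file()
```

To try it, call `run_benchmark(["model-a", "model-b"], fake_generate, has_gemfile, some_directory)` and pass the result to `format_report`. Passing the generation and validation steps in as functions keeps the loop independent of any particular agent or framework, which is a design choice of this sketch rather than something the project states.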