vishwanathakuthota / openvals
Open-source AI model evaluation and benchmarking framework for LLMs (OpenAI, Ollama, Claude, Gemini)
OpenVals is an open-source Python tool for evaluating and benchmarking AI language models on metrics like accuracy, semantic similarity, latency, reliability, and safety to recommend the best model for specific tasks.
How It Works
You hear about a friendly tool that helps test and compare different AI assistants to find the best one for your needs.
You add the tool to your computer with an easy download, and everything is ready to go.
You make a simple list of questions and expected answers to see how well AIs perform.
You connect the AI assistants you want to test, like local ones running on your machine.
You press go, and the tool asks each AI your questions, measuring speed, accuracy, safety, and more.
You see a clear leaderboard with scores, showing which AI excels in reliability, quickness, and trustworthiness.
Now you have a full report to help you pick the AI best suited to your work, with its safety and effectiveness scores in front of you.
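The steps above can be sketched in a few lines of Python. This is only an illustration of the general idea (a question list, a loop over models, and a ranked leaderboard), not OpenVals' actual API: the names `model_a`, `model_b`, `Result`, and `evaluate` are hypothetical, the two "models" are hard-coded stand-ins for real clients, and accuracy is assumed to be a plain case-insensitive exact match.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for real model clients (e.g. a local model).
def model_a(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "")


def model_b(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}
    return answers.get(prompt, "")


@dataclass
class Result:
    name: str
    accuracy: float        # fraction of exact-match answers
    mean_latency_s: float  # average seconds per question


def evaluate(
    models: Dict[str, Callable[[str], str]],
    dataset: List[Tuple[str, str]],
) -> List[Result]:
    """Ask every model every question and return a ranked leaderboard."""
    if not dataset:
        raise ValueError("dataset must contain at least one question")
    results = []
    for name, ask in models.items():
        correct = 0
        total_latency = 0.0
        for question, expected in dataset:
            start = time.perf_counter()
            answer = ask(question)
            total_latency += time.perf_counter() - start
            # Assumed metric: case-insensitive exact match.
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
        n = len(dataset)
        results.append(Result(name, correct / n, total_latency / n))
    # Highest accuracy first; ties are broken by lower latency.
    results.sort(key=lambda r: (-r.accuracy, r.mean_latency_s))
    return results


dataset = [("What is 2 + 2?", "4"), ("Capital of France?", "Paris")]
leaderboard = evaluate({"model_a": model_a, "model_b": model_b}, dataset)
for rank, r in enumerate(leaderboard, start=1):
    print(f"{rank}. {r.name}: accuracy={r.accuracy:.2f}, "
          f"latency={r.mean_latency_s:.6f}s")
```

A real run would swap the stand-in functions for calls to actual models and would likely add the other metrics mentioned above, such as semantic similarity and safety checks.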