SWE-rebench / SWE-rebench-V2
Tools and prompt templates used to build and evaluate SWE-rebench-v2 tasks for the paper.
A set of tools for preparing prompts, creating isolated test environments, and evaluating code fixes on software engineering benchmark tasks.
How It Works
You come across this collection of tools while reading about testing AI on real-world coding challenges from a research paper.
You pick a list of software problems, like fixing bugs or adding features, using sample examples or shared collections.
You generate prompts from the included templates so that an AI model can understand and label each coding challenge.
You set up safe, isolated areas where each challenge can run independently without interference.
You apply the proposed code changes to each challenge and run its built-in tests to see which fixes work.
You receive a summary showing which fixes passed all of their tests, which gives a measure of the AI's performance.
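The last two steps can be illustrated with a minimal Python sketch. It assumes each evaluated task yields a count of passed and total tests, and it treats a fix as resolved only when every test passes. The names `TaskResult` and `summarize` are hypothetical and chosen for illustration; they are not taken from the repository, whose actual result format may differ.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Hypothetical record for one evaluated task (not the repository's schema)."""
    task_id: str
    tests_passed: int
    tests_total: int

    @property
    def resolved(self) -> bool:
        # Assumption: a fix counts as resolved only if all tests ran and passed.
        return self.tests_total > 0 and self.tests_passed == self.tests_total


def summarize(results: list[TaskResult]) -> dict:
    """Aggregate per-task outcomes into a small summary dictionary."""
    resolved_ids = [r.task_id for r in results if r.resolved]
    total = len(results)
    return {
        "total": total,
        "resolved": len(resolved_ids),
        # Guard against division by zero when no tasks were evaluated.
        "resolved_rate": len(resolved_ids) / total if total else 0.0,
        "resolved_ids": resolved_ids,
    }
```

Calling `summarize` on a list of `TaskResult` objects returns the number of tasks, how many were resolved, and the resolved fraction. A task with no tests is counted as unresolved here, which is a design choice of this sketch rather than a documented rule.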