geocodebench

[CVPR '26] Benchmarking PhD-Level Coding in 3D Geometric Computer Vision

AI Summary

GeoCodeBench benchmarks large language models on implementing complex 3D geometric computer vision algorithms from research papers, using fill-in-the-blank coding tasks scored with unit tests.
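To make the task format concrete, here is a minimal sketch of what one fill-in-the-blank item could look like. The function, `<BLANK>` marker, and test below are illustrative stand-ins, not items taken from the benchmark itself.

```python
import numpy as np

# Hypothetical benchmark item: the model sees paper context plus this
# function with its body masked, and must fill in the blank.
MASKED_SNIPPET = '''
def quaternion_to_rotation_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    # <BLANK: model must implement this>
'''

def reference_solution(q):
    # One correct fill for the blank above, shown only so the unit test
    # below has something to score.
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def test_identity_quaternion():
    # Edge case: the identity quaternion must produce the identity matrix.
    R = reference_solution((1.0, 0.0, 0.0, 0.0))
    assert np.allclose(R, np.eye(3))
```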

How It Works

1
🔍 Discover GeoCodeBench

You find this benchmark on arXiv or the project page while researching AI coding abilities in 3D vision.

2
⚙️ Set up your workspace

Create a Python environment and install the dependencies so the benchmark scripts run locally.

3
🔗 Connect smart AI helpers

Configure API access to the LLMs under test (e.g., GPT, Claude, Gemini) so they can generate code solutions.
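For instance, a minimal client hook might look like the sketch below. It assumes the official `openai` Python package and an `OPENAI_API_KEY` in the environment; how GeoCodeBench actually wires up each provider may differ.

```python
from openai import OpenAI  # assumes the official openai package; other providers vary

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_solution(prompt: str, model: str = "gpt-4o") -> str:
    """Send one benchmark prompt to a chat model and return its raw reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # keep output as deterministic as possible for benchmarking
    )
    return response.choices[0].message.content
```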

4
🤖 Let AIs solve the challenges

Watch as different AIs read papers and fill in missing code for tough 3D vision problems.
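The prompt pairs paper context with the masked snippet. A hypothetical builder is sketched below; the benchmark's real template is not reproduced here.

```python
def build_prompt(paper_text: str, masked_code: str) -> str:
    """Assemble a fill-in-the-blank prompt from paper context plus masked code."""
    return (
        "You are implementing an algorithm from the following paper.\n\n"
        f"--- PAPER ---\n{paper_text}\n\n"
        f"--- CODE WITH BLANKS ---\n{masked_code}\n\n"
        "Return the completed function in a single fenced Python code block."
    )
```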

5
📝 Pull out the code

Extract the AI-generated code into runnable files ready for testing.
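A common approach is to grab the first fenced code block from each reply; the repo's own parser may be stricter (multiple blocks, missing fences), so treat this as a sketch.

```python
import re
from pathlib import Path

CODE_BLOCK_RE = re.compile(r"```(?:python)?\n(.*?)```", re.DOTALL)

def extract_code(llm_reply: str, out_path: Path) -> Path:
    """Save the first fenced code block from a model reply as a runnable file."""
    match = CODE_BLOCK_RE.search(llm_reply)
    code = match.group(1) if match else llm_reply  # fall back to the raw reply
    out_path.write_text(code)
    return out_path
```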

6
🧪 Run the tests

Automatically check how well each AI's code handles edge cases with unit tests.
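One simple harness shells out to pytest and treats a zero exit code as a pass; running generated code in a subprocess also keeps crashes from taking down the harness itself. The layout below is an assumption, not the repo's exact runner.

```python
import subprocess
import sys

def run_unit_tests(test_file: str) -> bool:
    """Run one test file with pytest; exit code 0 means every test passed."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", test_file, "-q"],
        capture_output=True, text=True, timeout=120,  # kill runaway solutions
    )
    return result.returncode == 0
```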

7
📊 See the rankings

Get clear summaries showing which AIs excel at PhD-level 3D coding tasks.
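A minimal aggregation sketch, assuming illustrative (model, capability, passed) records rather than the repo's actual result schema:

```python
import csv
from collections import defaultdict

def summarize(results, out_csv="summary.csv"):
    """Roll (model, capability, passed) records up into per-capability pass rates."""
    totals = defaultdict(lambda: [0, 0])  # (model, capability) -> [passed, attempted]
    for model, capability, passed in results:
        totals[(model, capability)][0] += int(passed)
        totals[(model, capability)][1] += 1

    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["model", "capability", "pass_rate"])
        for (model, capability), (ok, n) in sorted(totals.items()):
            writer.writerow([model, capability, f"{ok / n:.2%}"])
```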

AI-Generated Review

What is GeoCodeBench?

GeoCodeBench lets you benchmark LLMs on PhD-level coding tasks pulled from recent CVPR papers such as UVGS and 3DGUT, focusing on 3D geometric computer vision. Feed it structured paper content and masked code snippets; it prompts LLMs for implementations, extracts runnable Python code, runs edge-case unit tests, and summarizes pass rates by capability, such as geometric transformations or novel algorithms. Developers get a ready-to-run Python suite for evaluating AI coding on vision pipelines.

Why is it gaining traction?

Unlike generic coding benchmarks, it targets research-grade 3D vision drawn from CVPR 2026 repositories and arXiv papers, with prompts built from full papers or methods sections for realistic evaluation. Batch scripts handle multiple models (GPT, Claude, Gemini) in parallel, auto-extract code from responses, and produce CSV summaries grouped by question type, making side-by-side comparison of paper implementations straightforward (a minimal sketch of the parallel pattern follows below). The task hierarchy, from basic transforms to full pipelines, hooks LLM researchers tracking progress on vision coding.
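The parallel batching is easy to picture with a thread pool; the model names and `call_model` hook below are illustrative, not the repo's actual script interface.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

MODELS = ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"]  # illustrative names

def run_batch(prompt: str, call_model: Callable[[str, str], str]) -> dict:
    """Fan one benchmark prompt out to several models at once.

    call_model(prompt, model_name) can be any client function, e.g. the
    generate_solution sketch earlier on this page.
    """
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {pool.submit(call_model, prompt, name): name for name in MODELS}
        return {futures[f]: f.result() for f in as_completed(futures)}
```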

Who should use this?

AI researchers fine-tuning LLMs for computer-vision coding, especially those replicating recent CVPR paper implementations. Geometric-vision developers testing whether models can implement splatting or projection algorithms from published methods. Benchmarking teams needing PhD-level Python vision tasks that go beyond toy problems.

Verdict

Grab it if you're into LLM-vision evals: solid unit tests and automation make it practical despite the repo's early stage (around 20 stars at the time of review). The early CVPR 2026 timing means you should watch for updates, but the docs and scripts are dev-ready now.

