aim-uofa / GSI-Bench

Public

[CVPR2026] Exploring Spatial Intelligence from a Generative Perspective

18 stars · 0 forks · 100% credibility
Found Apr 26, 2026 at 18 stars.
AI Analysis
Python
AI Summary

GSI-Bench is an evaluation framework for testing generative AI models' understanding of 3D spatial relationships in indoor scenes through metrics like instruction compliance, spatial accuracy, appearance consistency, and edit locality.
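As a rough sketch of what one scoring call across those four metrics could look like in Python (the function name, metric keys, and toy formulas below are illustrative assumptions, not GSI-Bench's actual API):

    import numpy as np

    # Hypothetical per-edit scoring record; the real benchmark computes these metrics
    # with more sophisticated machinery (detectors, 3D ground truth, learned judges).
    def evaluate_edit(original: np.ndarray, edited: np.ndarray, edit_mask: np.ndarray) -> dict:
        """original, edited: HxWx3 floats in [0, 1]; edit_mask: HxW bool, True inside the targeted region."""
        outside = ~edit_mask
        # Edit Locality: how little changed outside the region the instruction targets.
        locality = 1.0 - float(np.abs(edited[outside] - original[outside]).mean())
        # Appearance Consistency: crude global colour-statistics match (placeholder).
        consistency = 1.0 - float(np.abs(edited.mean(axis=(0, 1)) - original.mean(axis=(0, 1))).mean())
        return {
            "instruction_compliance": None,  # needs a semantic judge in the real pipeline
            "spatial_accuracy": None,        # needs 3D ground truth of the target pose
            "appearance_consistency": consistency,
            "edit_locality": locality,
        }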

How It Works

1
🔍 Discover GSI-Bench

You find this free benchmark on GitHub to test how well AI image editors handle 3D spatial instructions like 'move the chair left'.

2
🛠️ Set up your playground

Install a simple Python setup so your computer can run the tests smoothly.

3
📥 Download test scenes

Grab ready-made indoor room photos with exact 'before and after' edits for comparison.

4
🤖 Feed your AI images

Run your generative model on the scenes with instructions, creating edited versions (see the sketch after this list).

5
Launch the evaluator

Hit run to automatically check compliance, accuracy, consistency, and locality.

6

📈 Get clear scores

See detailed reports showing your AI's spatial smarts, ready to improve or share.
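A minimal driver for the generate-then-evaluate loop from steps 4 and 5 might look like the following; the folder layout, the model's edit() call, and the file names are assumptions for illustration, so check the repo's README for the structure the evaluator actually expects.

    from pathlib import Path

    # Hypothetical: walk each scene folder, apply your model's edit, and save the
    # result where an evaluator could pick it up. Adapt paths to the benchmark's spec.
    def generate_outputs(model, scenes_dir: str, out_dir: str) -> None:
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        for scene in sorted(p for p in Path(scenes_dir).iterdir() if p.is_dir()):
            instruction = (scene / "instruction.txt").read_text().strip()
            edited = model.edit(scene / "input.png", instruction)  # your model's own API
            edited.save(out / f"{scene.name}.png")

After that, running the repository's evaluation script (e.g. bash eval.sh, as mentioned in the review below) is what produces the per-metric report.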

AI-Generated Review

What is GSI-Bench?

GSI-Bench is a Python-based benchmark for exploring spatial intelligence from a generative perspective, accepted to CVPR 2026. It tests how well generative models handle 3D spatial tasks in indoor scenes, such as moving, rotating, scaling, or removing objects, via edited image outputs. Users get four key metrics: Instruction Compliance, Spatial Accuracy, Appearance Consistency, and Edit Locality, with ready-to-run evaluation scripts after downloading the datasets and weights.
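To make the Spatial Accuracy idea concrete, here is a purely illustrative scoring function; GSI-Bench's actual metric definition, tolerances, and coordinate conventions may differ.

    import math

    # Illustrative only: error between where the instruction says the object should
    # end up and where it is detected in the edited scene, mapped to a [0, 1] score.
    def spatial_accuracy(target_xyz, detected_xyz, tolerance_m: float = 0.10) -> float:
        err = math.dist(target_xyz, detected_xyz)  # Euclidean distance in metres
        return max(0.0, 1.0 - err / tolerance_m)

    # "Move the chair 20 cm to the left" -> target is 0.2 m along -x from the start.
    start = (1.00, 0.00, 2.50)
    target = (start[0] - 0.20, start[1], start[2])
    print(spatial_accuracy(target, detected_xyz=(0.83, 0.00, 2.51)))  # ~0.68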

Why is it gaining traction?

The benchmark stands out by automating 3D-aware evaluation for any generative model, skipping the manual annotation hassles common in scene-editing tests. Developers plug in their model's outputs (e.g., via a simple folder structure and bash eval.sh), get JSON reports across real and synthetic datasets, and can add optional MLLM-based scoring. Its transparency, with full data pipelines for RoboTHOR and MesaTask scenes, builds trust for reproducible research.
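As a sketch of how those JSON reports could be aggregated per dataset and metric; the schema below is an assumption rather than the repo's actual output format, so adjust the keys to match what eval.sh writes.

    import json
    from collections import defaultdict
    from pathlib import Path

    # Assumed schema: {"samples": [{"dataset": "...", "metrics": {"edit_locality": 0.91, ...}}, ...]}
    def aggregate(report_path: str) -> dict:
        report = json.loads(Path(report_path).read_text())
        sums, counts = defaultdict(float), defaultdict(int)
        for sample in report["samples"]:
            for metric, value in sample["metrics"].items():
                key = (sample["dataset"], metric)  # keep real vs. synthetic datasets separate
                sums[key] += value
                counts[key] += 1
        return {key: sums[key] / counts[key] for key in sums}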

Who should use this?

Computer vision researchers benchmarking generative models on spatial reasoning, especially indoor scene manipulation. Teams building image editors or 3D-aware diffusion models will value the plug-and-play eval for tasks like "move the chair 20 cm left." Anyone prepping CVPR-style papers on generative spatial intelligence can use its metrics for quick baselines.

Verdict

Try it for spatial intelligence evals: solid docs and tests make setup straightforward, even though the low star count (18 at discovery) signals early maturity. Worth watching as CVPR 2026 drives adoption.
