AMAP-ML

A comprehensive benchmark specifically designed to evaluate the interactive response capabilities of world models in 4D settings.

Found Mar 24, 2026 at 64 stars
AI Analysis
AI Summary

Omni-WorldBench is a benchmark for evaluating the interactive capabilities of AI world models that generate 4D videos, featuring prompt suites, metrics, and leaderboards.

How It Works

1
Discover Omni-WorldBench

You stumble upon this exciting new benchmark while searching for ways to test AI models that predict how the world changes over time.

2
Read the Overview

You learn it's a tool for checking how well AI handles interactions in videos, like throwing a ball or a robot picking up chips.

3
Watch Demo Videos

You watch side-by-side videos of different AI models responding to real-life actions and camera movements.

4
Explore the Leaderboard

You see a chart ranking various AI models on their ability to predict interactive scenes accurately.

5
Check the Research Paper

You head to the linked article to understand the full details of this evaluation method.

๐Ÿ† Advance Your AI Work

Now equipped with this benchmark, you can better test and improve AI that understands dynamic worlds.

AI-Generated Review

What is Omni-WorldBench?

Omni-WorldBench is a comprehensive benchmark designed to evaluate the interactive response capabilities of world models in 4D settings, focusing on how actions drive state changes in video generation and 3D reconstruction. It includes Omni-WorldSuite, a prompt suite for diverse interactions like object manipulation and camera trajectories, and Omni-Metrics, an agent-based framework that measures causal impacts on outcomes and trajectories. Developers get leaderboards, quantitative results on 18 models, and video galleries to benchmark their own systems against.
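The benchmark's actual code isn't shown on this page, so the following is only a minimal sketch of how a prompt-suite-plus-metric evaluation loop like the one described above could be wired together. Every name here (`InteractionPrompt`, `generate_video`, `judge_outcome`) is a hypothetical stand-in, not Omni-WorldBench's real API:

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical sketch of an action-conditioned evaluation loop.
# Prompt fields, the model stub, and the scoring rule are all
# illustrative assumptions, not the benchmark's actual interface.

@dataclass
class InteractionPrompt:
    scene: str      # e.g. "a ball resting on a table"
    action: str     # e.g. "throw the ball to the left"
    expected: str   # qualitative expected outcome, used by the judge

def generate_video(model, prompt: InteractionPrompt) -> str:
    """Stand-in for a video generator: returns a text description of
    what the model produced, so the judge below has input to score."""
    return model(prompt.scene, prompt.action)

def judge_outcome(produced: str, expected: str) -> float:
    """Toy agent-based metric: 1.0 if the expected outcome appears in
    the produced description, else 0.0. A real metric would analyze
    object trajectories and causal state changes, not strings."""
    return 1.0 if expected.lower() in produced.lower() else 0.0

def evaluate(model, prompts: list[InteractionPrompt]) -> float:
    """Average judged score across the prompt suite."""
    scores = [judge_outcome(generate_video(model, p), p.expected)
              for p in prompts]
    return mean(scores)

# Usage: a dummy "model" that always reports the ball flying left.
prompts = [
    InteractionPrompt("a ball on a table", "throw it left", "flies left"),
    InteractionPrompt("a robot by a bag", "pick up the bag", "bag is lifted"),
]
dummy_model = lambda scene, action: "the ball flies left"
print(evaluate(dummy_model, prompts))  # 0.5: one of two outcomes matched
```

The point of the sketch is the structure, not the scoring: a fixed prompt suite drives generation, an automated judge scores each outcome, and the mean becomes a leaderboard entry.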

Why is it gaining traction?

Unlike benchmarks stuck on visual fidelity or static 3D metrics, this one targets interactive dynamics head-on, revealing gaps in current video generative models. The plug-and-play prompt suite and automated metrics make it easy to run evals, with pre-computed results providing instant baselines. Early adopters value the arXiv paper's insights into model limitations for 4D world modeling.

Who should use this?

ML engineers building video diffusion models or 4D simulators who need to test action-conditioned generation. Researchers comparing world models like those for robotics sims or autonomous agents. Teams iterating on multimodal LLMs for interactive environments.

Verdict

Grab it if you're in world modeling: solid docs and evals make it a quick win for baselines, though the low star count signals early days. Wait for code releases if you need production-ready tooling.
