RoboTwin-Platform

Memory-Dependent Manipulation Benchmark based on RoboTwin

AI Summary

RMBench is a benchmark for evaluating robotic manipulation policies that depend on memory of past observations and actions in simulated dual-arm environments.

How It Works

1
🔍 Discover RMBench

You find RMBench, a benchmark for testing how well manipulation policies remember past observations while handling object tasks like stacking blocks or pressing buttons.

2
💻 Set up your workspace

Follow the README to prepare your machine: create a conda environment and clone the starter repository.

3
📥 Grab ready scenes

Download example scenes, object assets, and demonstration data from Hugging Face to watch the tasks in action (a download sketch follows these steps).

4
🤖 Watch robots solve puzzles

Run the provided baseline policies on tasks like covering blocks, swapping items, or pressing buttons; each relies on memory of what the robot has already seen (see the rollout sketch after these steps).

5
🧠 Test or train robot smarts

Evaluate ready-made policies such as ACT or DexVLA, or train your own memory-conditioned policy to manipulate objects better; the evaluation loop in the sketch below reports a per-task success rate.

Master memory challenges

Your policy gets scored on tasks that hinge on memory, like rearranging or inserting pieces.
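
A minimal sketch of step 3, assuming the demo assets are hosted as a Hugging Face dataset repo as the README describes; the repo_id below is a placeholder, not the benchmark's actual dataset id.

    # Hedged sketch: fetch benchmark assets from Hugging Face.
    # The repo_id is a placeholder -- substitute the dataset id
    # given in the RMBench README.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(
        repo_id="your-org/rmbench-assets",  # placeholder id
        repo_type="dataset",
        local_dir="./assets",
    )
    print(f"Demo scenes and trajectories saved under {local_path}")

snapshot_download caches files locally and returns the target directory, so repeated runs only fetch what changed.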
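
And a hypothetical sketch of steps 4 and 5: a rollout loop in which the policy conditions on a window of past observations rather than a single frame. The env and policy interfaces here (reset, step, act) are illustrative assumptions, not RMBench's actual API; consult the repo's scripts for the real entry points.

    # Hypothetical rollout + evaluation loop for a memory-dependent task.
    from collections import deque

    HISTORY_LEN = 8  # how many past observations the policy may see

    def rollout(env, policy, max_steps=200):
        history = deque(maxlen=HISTORY_LEN)  # rolling observation memory
        obs = env.reset()
        for _ in range(max_steps):
            history.append(obs)
            action = policy.act(list(history))  # conditions on the window
            obs, done, info = env.step(action)
            if done:
                return bool(info.get("success", False))
        return False  # ran out of steps without solving the task

    def evaluate(env, policy, n_episodes=50):
        wins = sum(rollout(env, policy) for _ in range(n_episodes))
        return wins / n_episodes  # success rate across episodes

The point of the sketch is that the policy sees a history, not just the latest frame; a memory-free policy will stall on tasks where the relevant object state is no longer visible.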


AI-Generated Review

What is RMBench?

RMBench is a Python benchmark for testing memory-dependent robotic manipulation policies on the RoboTwin platform, evaluating how agents remember and adapt to dynamic scenes such as block rearrangements or battery insertions. It provides standardized tasks, demo data hosted on Hugging Face, and scripts to collect trajectories, run evaluations, and deploy policies such as ACT or DexVLA. Developers get a drop-in suite for measuring policy robustness beyond single-shot, reactive actions: a standardized probe suite in the spirit of HarmBench, but aimed at robot manipulation rather than LLM safety.
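
To make the "drop-in" claim concrete, here is a guess at the minimal contract a policy would need to satisfy to plug into such a harness. The Protocol below is illustrative, not RMBench's published API.

    # Illustrative policy contract (an assumption, not RMBench's API):
    # anything exposing reset() and act(history) can be benchmarked,
    # whether it wraps ACT, DexVLA, or a scripted baseline.
    from typing import Any, Protocol, Sequence

    class MemoryPolicy(Protocol):
        def reset(self) -> None:
            """Clear internal state at the start of an episode."""

        def act(self, obs_history: Sequence[Any]) -> Any:
            """Map the observation history so far to the next action."""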

Why is it gaining traction?

Unlike generic manipulation benchmarks, RMBench stresses memory across episodes, with unseen instructions and object variations that expose policy weaknesses in long-horizon reasoning. Quick setup via conda environments, Hugging Face asset downloads, and RoboTwin compatibility let users transfer existing policies without rewriting code. The associated arXiv preprint draws robotics researchers hunting reproducible, memory-focused evals.

Who should use this?

Robotics researchers benchmarking vision-language-action (VLA) models on memory-intensive tasks like sequential covering or button-pressing sequences. ML engineers at labs extending RoboTwin data for training adaptive manipulators. Teams probing policy generalization to novel scenes without building custom sims from scratch.

Verdict

Promising for memory-dependent manipulation benchmarking, especially if you're already on RoboTwin, but at 44 stars this is an early-stage project: docs are README-focused with community links, so expect some setup tweaks. Prototype it for research; production users should monitor updates.

