alibaba-damo-academy

RynnBrain: Open Embodied Foundation Models

568 stars · 55 forks · 100% credibility
Found Feb 13, 2026 at 280 stars (2× growth since)
Language: Jupyter Notebook

AI Summary

RynnBrain is an open-source vision-language model from Alibaba DAMO Academy that excels in egocentric video understanding, precise spatio-temporal localization, physical-space reasoning, and robot task planning.

How It Works

1
👀 Discover RynnBrain

You hear about a smart AI helper that understands video from a robot's viewpoint: spotting objects, tracing paths, and planning actions.

2
🖥️ Try the online demo

Upload a short video or photo and ask simple questions like 'Where's the cup?' to see it think and point instantly.

3
✨ Watch it shine

The AI draws boxes around objects, traces movements, and suggests next steps, making robot vision feel magical.

4
📖 Follow fun guides

Open ready-made notebooks that show how to count items, find grasp spots, or plan robot paths step by step.

5
💻 Get your own copy

Download the free model weights and run them on your own machine to test with your own videos (see the sketch after this list).

6
🔧 Customize for your needs

Tweak it with your robot footage to teach special skills like navigating rooms or picking items.

🤖 Your robot gets smarter

Now your robot sees, understands, and acts in the real world just like you imagined.
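
To make step 5 concrete, here is a minimal local-inference sketch, assuming the weights load as a standard Hugging Face vision-language checkpoint. The model ID, image file, and one-shot prompt handling are illustrative assumptions, not the repo's documented API; the cookbooks are the authoritative reference.

```python
# Minimal sketch: ask a RynnBrain checkpoint a pointing question about one frame.
# MODEL_ID is a hypothetical placeholder; see the repo's Hugging Face model cards.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "Alibaba-DAMO-Academy/RynnBrain-2B"  # assumption, not a confirmed ID

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # halve memory on GPUs with bf16 support
    device_map="auto",
    trust_remote_code=True,
)

image = Image.open("kitchen_frame.jpg")   # any frame from your own footage
inputs = processor(images=image, text="Where's the cup?", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

If the smallest variant really is 2B, it is the natural starting point on a single GPU; the larger checkpoints will likely need multi-GPU setups or offloading.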

AI-Generated Review

What is RynnBrain?

RynnBrain delivers open embodied foundation models trained on egocentric videos for real-world tasks like robot planning, vision-language navigation, and chain-of-point reasoning. Built on Qwen3-VL bases in sizes from 2B to 30B-A3B MoE, it processes omni-vision inputs to output trajectories, pointing, and actions via a unified encoder-decoder setup. Developers get pretrained weights on Hugging Face, Jupyter Notebook cookbooks for spatial cognition and planning demos, and fine-tuning scripts for custom embodied models.
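
Because the models are trained on egocentric video, a natural next test is frame-sampled clip input. The sketch below is hedged: the model ID is again a placeholder, and whether the processor accepts a plain list of PIL frames (versus a dedicated video path) is an assumption to verify against the cookbooks.

```python
# Hedged sketch: uniformly sample frames from an egocentric clip with OpenCV
# and ask a planning question. Model ID and multi-image input are assumptions.
import cv2
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "Alibaba-DAMO-Academy/RynnBrain-2B"  # hypothetical ID

def sample_frames(path: str, n: int = 8) -> list[Image.Image]:
    """Uniformly sample up to n RGB frames from a video file."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in range(0, total, max(total // n, 1)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    cap.release()
    return frames[:n]

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

frames = sample_frames("egocentric_clip.mp4")
inputs = processor(images=frames, text="Plan the next three robot actions.",
                   return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```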

Why is it gaining traction?

It stands out by grounding language reasoning in physical space, alternating text and localization for precise outputs like grasp poses or paths, and it beats baselines on embodied QA, counting, and navigation benchmarks. The plug-and-play HF integration, Gradio demo space, and ready-to-run notebooks let devs test video understanding without setup hassle. Early results show it boosts downstream VLAs for manipulation and navigation.
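
The "alternating text and localization" output is easiest to picture with a toy parser. The `<point>x,y</point>` serialization below is purely illustrative, loosely modeled on Qwen-VL-style grounding tags; the actual output schema lives in the repo's cookbooks.

```python
# Toy parser for a reply that interleaves prose with localization tags.
# The <point>x,y</point> format is a hypothetical stand-in for illustration.
import re

POINT_TAG = re.compile(r"<point>\s*(\d+)\s*,\s*(\d+)\s*</point>")

def split_text_and_points(reply: str):
    """Separate an interleaved reply into plain text and (x, y) pixel points."""
    points = [(int(x), int(y)) for x, y in POINT_TAG.findall(reply)]
    text = " ".join(POINT_TAG.sub(" ", reply).split())  # drop tags, tidy spacing
    return text, points

reply = "The cup is on the counter <point>412,198</point> next to the sink."
text, points = split_text_and_points(reply)
print(text)    # The cup is on the counter next to the sink.
print(points)  # [(412, 198)]
```

Keeping the prose and coordinates in one autoregressive stream is what lets a downstream planner consume both the explanation and the actionable location from a single forward pass.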

Who should use this?

Robotics engineers fine-tuning VLAs for manipulation or household tasks, embodied AI researchers benchmarking spatial grounding, and sim-to-real devs needing navigation models for Habitat or MP3D scenes. Ideal for teams prototyping with egocentric video data who want foundation models over scratch training.

Verdict

Grab it if embodied AI is your jam: the released models and notebooks make experimentation fast, even though the project's youth (found at 280 stars, now 568) signals early maturity. Docs are solid with performance tables, but expect tweaks for production scaling.
