lukasHoel

Our method reconstructs 3D worlds from video diffusion models using non-rigid alignment to resolve inherent 3D inconsistencies in the generated sequences.

Found Mar 18, 2026 at 48 stars.
Python
AI Summary

This repository implements a method to reconstruct coherent 3D scenes from inconsistent video sequences produced by generative models, using non-rigid alignment and Gaussian Splatting.

How It Works

1
🎥 Pick your video

Grab a fun video clip, like one generated by an AI tool showing a moving scene.

2
🚀 Start reconstruction

Drop your video into the easy launcher and choose fast or detailed mode.

3
✨ Watch it build

Sit back as it automatically extracts frames, aligns views, and creates a unified 3D world from the shaky video.

4
πŸ” Preview results

Check progress with built-in videos and point clouds along the way.

5
👀 Explore in viewer

Launch the interactive viewer to fly around and inspect your new 3D scene from any angle.

6
🎉 Share your 3D world

Export models or videos to show friends the 3D world you built from a shaky clip.
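The "watch it build" stage boils down to lifting each frame's estimated depth into 3D before anything can be aligned. A minimal sketch of that back-projection step, assuming a simple pinhole camera; the intrinsics `fx`, `fy`, `cx`, `cy` below are illustrative values, not the repo's actual camera handling:

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a depth map (H, W) into camera-space 3D points (H*W, 3)
    using a pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a constant-depth 4x4 frame maps to a plane at Z = 2.
pts = backproject_depth(np.full((4, 4), 2.0), fx=100.0, fy=100.0, cx=1.5, cy=1.5)
```

Doing this per frame yields one point cloud per frame; the pipeline's job is then to merge those mutually inconsistent clouds into a single model.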

AI-Generated Review

What is video_to_world?

This Python project turns videos generated by diffusion models into coherent 3D worlds, fixing inherent 3D inconsistencies through non-rigid alignment of per-frame depth estimates. Feed it an MP4 from a video diffusion model and it outputs Gaussian Splatting scenes (2DGS or 3DGS) for novel-view rendering, complete with PLY exports and flythrough videos. Built on DepthAnything-3 for depth, RoMa for matching, and gsplat for splatting, the video-to-world method handles generated sequences that flicker or warp across frames.
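The project's alignment is non-rigid, but it generalizes a rigid registration problem with a classic closed-form solution. As a baseline sketch (not the repo's actual method), here is the Kabsch/orthogonal Procrustes fit between two point sets with known correspondences, e.g. back-projected depths from neighboring frames; the synthetic rotation and translation are made up for the demo:

```python
import numpy as np

def kabsch(src, dst):
    """Best-fit rotation R and translation t minimizing ||R @ src_i + t - dst_i||
    over corresponding point sets src, dst of shape (N, 3)."""
    src_c = src - src.mean(axis=0)                  # center both clouds
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)       # 3x3 cross-covariance
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Recover a known rigid motion from noiseless correspondences.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
dst = src @ R_true.T + t_true
R, t = kabsch(src, dst)
```

A non-rigid aligner relaxes the single global `(R, t)` into a deformation field, which is what lets it absorb the warping that diffusion outputs exhibit.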

Why is it gaining traction?

Unlike rigid SLAM tools that fail on diffusion outputs, it uses frame-to-model ICP and global optimization to merge inconsistent views into a sharp canonical model, enabling deformation-aware Gaussian training. The one-liner CLI (`run_reconstruction.py --config.input-video video.mp4`) with fast/extensive modes delivers quick 3D results, plus utilities for viewing checkpoints interactively. Developers dig the end-to-end pipeline that resolves alignment issues in AI-generated video worlds without manual tweaks.
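Frame-to-model ICP alternates two steps: match each frame point to its nearest point in the canonical model, then solve in closed form for the rigid transform that best explains the matches. A toy numpy sketch of one such iteration, using brute-force nearest neighbors on a synthetic grid; the repo's version is deformation-aware and registers against a Gaussian model, not a raw point cloud:

```python
import numpy as np

def icp_step(frame_pts, model_pts):
    """One frame-to-model ICP iteration: nearest-neighbor matching
    followed by a closed-form (Kabsch) rigid fit."""
    d2 = ((frame_pts[:, None, :] - model_pts[None, :, :]) ** 2).sum(axis=-1)
    matched = model_pts[d2.argmin(axis=1)]          # current correspondences
    src_c = frame_pts - frame_pts.mean(axis=0)
    dst_c = matched - matched.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = matched.mean(axis=0) - R @ frame_pts.mean(axis=0)
    return frame_pts @ R.T + t                      # re-posed frame points

# Toy canonical model: a 3x4x5 grid; the "frame" is the same cloud, shifted.
gx, gy, gz = np.arange(3.0), np.arange(4.0), np.arange(5.0)
model = np.stack(np.meshgrid(gx, gy, gz, indexing="ij"), axis=-1).reshape(-1, 3)
frame = model + np.array([0.1, -0.1, 0.05])
for _ in range(5):
    frame = icp_step(frame, model)  # snaps back onto the model
```

The global optimization the review mentions would then refine all frame poses jointly rather than one frame at a time.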

Who should use this?

Computer vision engineers prototyping 3D from video diffusion models like Sora or Stable Video, where inherent inconsistencies kill naive reconstruction. Researchers in video-to-world models evaluating non-rigid methods against baselines. AR/VR developers needing quick 3D assets from synthetic footage.

Verdict

Solid pick for experimenting with reconstruction from diffusion-generated video, with strong docs, an arXiv paper, and modular pipeline stages. At 48 stars it's early; test on your own clips before betting big.


