Visual-AI

[CVPR 2026 Findings] Speed3R: Sparse Feed-forward 3D Reconstruction Models

Found Mar 10, 2026 at 19 stars.
AI Summary

Speed3R is an efficient feed-forward model that reconstructs 3D point clouds and camera poses from image sequences or videos using sparse attention for faster inference.

How It Works

1
🔍 Discover Speed3R

You hear about a fun tool that turns everyday videos or photo sets into amazing 3D models in seconds.

2
🚀 Open the demo

Launch the web demo on your computer to start creating 3D scenes right away.

3
📱 Upload your media

Drag in a video clip or a folder of photos, and see them previewed instantly.

4
✨ Reconstruct in 3D

Click the magic button and watch as your media transforms into a colorful 3D point cloud with camera views.

5
🔧 Explore and tweak

Rotate, zoom, and adjust settings like confidence thresholds or camera display to perfect your scene.

6
🎉 Download your model

Grab your shiny 3D file ready to view, share, or use in any 3D app.
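The file you download in step 6 is typically a PLY point cloud. As a rough illustration of what such a file contains, here is a minimal pure-Python writer with toy data; this is not Speed3R's actual export code.

```python
import random

def write_ply(path, points, colors):
    """Write an ASCII PLY point cloud.
    points: list of (x, y, z) floats; colors: list of (r, g, b) ints in 0-255."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    with open(path, "w") as f:
        f.write("\n".join(header) + "\n")
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write(f"{x} {y} {z} {r} {g} {b}\n")

# Toy data standing in for a reconstruction result
pts = [(random.random(), random.random(), random.random()) for _ in range(100)]
cols = [(255, 128, 0)] * 100
write_ply("scene.ply", pts, cols)
```

A file in this layout opens directly in viewers like MeshLab or CloudCompare, which is why PLY is a common export target for point-cloud models.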

AI-Generated Review

What is Speed3R?

Speed3R is a Python toolkit for feed-forward 3D reconstruction from image sequences or videos, outputting point clouds, per-view local points, confidence maps, and camera poses in one pass. It tackles the quadratic slowdown of dense attention in models like Pi3 by using sparse, keypoint-inspired attention, delivering 12.4x faster inference on 1000-view inputs with minor accuracy trade-offs. Load pretrained weights from Hugging Face, run CLI inference on videos, or spin up a Gradio web demo for interactive results.
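The scaling argument behind that speedup is easy to sanity-check with back-of-envelope arithmetic. The token counts and sparsity budget below are illustrative assumptions, not Speed3R's actual configuration:

```python
# Back-of-envelope attention cost: dense vs. top-k sparse.
# All sizes here are illustrative assumptions, not Speed3R's real config.
views = 1000
tokens_per_view = 196        # e.g. a 14x14 patch grid per view (assumed)
n = views * tokens_per_view  # total tokens across the sequence

dense_pairs = n * n          # dense attention score entries: O(n^2)
k = 2048                     # keys attended per query (assumed budget)
sparse_pairs = n * k         # sparse attention score entries: O(n*k)

print(f"dense:  {dense_pairs:.2e} score entries")
print(f"sparse: {sparse_pairs:.2e} score entries")
print(f"ratio:  {dense_pairs / sparse_pairs:.1f}x fewer")
```

Note the ratio only covers the attention-score term; the end-to-end speedup the repo reports (12.4x) is smaller because MLPs, projections, and the decoder cost the same either way.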

Why is it gaining traction?

Unlike dense baselines, Speed3R's dual-branch sparse attention focuses compute on informative tokens, slashing latency for large-scale scenes without retraining. Developers grab it for the plug-and-play Hugging Face models, video-to-PLY export, and in-browser GLB visualization, making it perfect for quick prototypes. It also rides the CVPR 2026 wave, with curated lists of CVPR 2026 papers already buzzing on Reddit.
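The review doesn't spell out the dual-branch design, but the core idea of "focusing compute on informative tokens" can be sketched generically: each query attends only to its top-k highest-scoring keys. A minimal single-head NumPy sketch, not Speed3R's actual implementation:

```python
import numpy as np

def topk_sparse_attention(q, k_mat, v, k=4):
    """Single-head attention where each query attends only to its top-k
    highest-scoring keys (a generic sparsification for illustration,
    not Speed3R's actual dual-branch scheme)."""
    d = q.shape[-1]
    scores = q @ k_mat.T / np.sqrt(d)              # (Nq, Nk) raw scores
    # Keep only the top-k keys per query; mask the rest to -inf.
    idx = np.argpartition(scores, -k, axis=-1)[:, -k:]
    masked = np.full_like(scores, -np.inf)
    np.put_along_axis(masked, idx,
                      np.take_along_axis(scores, idx, axis=-1), axis=-1)
    # Softmax over the surviving entries; masked keys get exactly 0 weight.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
km = rng.standard_normal((32, 16))
v = rng.standard_normal((32, 16))
out, w = topk_sparse_attention(q, km, v, k=4)
print(out.shape)              # (8, 16)
print((w > 0).sum(axis=-1))   # exactly 4 nonzero weights per query
```

Only the top-k score columns survive the softmax, so the score matrix that must be materialized shrinks from Nq x Nk to Nq x k, which is where the latency win comes from.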

Who should use this?

Computer vision researchers benchmarking SfM alternatives or preparing CVPR 2026 submissions. Robotics engineers reconstructing environments from drone footage. AR/VR devs turning phone videos into 3D assets for Unity or Three.js pipelines.
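One practical wrinkle for those engine pipelines: models in this family typically predict camera-to-world poses, while renderers usually want world-to-camera view matrices. A hedged sketch of the standard rigid-pose inversion; the conventions and example values are assumptions, not taken from the repo:

```python
import numpy as np

def invert_pose(T):
    """Invert a 4x4 rigid camera-to-world pose [R|t] analytically:
    the inverse is [R^T | -R^T t], cheaper and better conditioned
    than a general np.linalg.inv."""
    R, t = T[:3, :3], T[:3, 3]
    Tinv = np.eye(4)
    Tinv[:3, :3] = R.T
    Tinv[:3, 3] = -R.T @ t
    return Tinv

# Toy camera-to-world pose: 90-degree yaw plus a translation (illustrative)
theta = np.pi / 2
c2w = np.array([
    [np.cos(theta), -np.sin(theta), 0.0, 1.0],
    [np.sin(theta),  np.cos(theta), 0.0, 2.0],
    [0.0,            0.0,           1.0, 0.5],
    [0.0,            0.0,           0.0, 1.0],
])
w2c = invert_pose(c2w)
print(np.allclose(w2c @ c2w, np.eye(4)))   # True
```

Whether a given engine additionally flips axes (e.g. OpenGL's -Z forward vs. OpenCV's +Z forward) depends on its convention, so check that before feeding poses in.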

Verdict

Early but solid starter (19 stars, 1.0% credibility): great docs, examples, and a Gradio demo make it approachable, though training code and the VGGT variant are still pending. Track it ahead of CVPR 2026 workshops if fast multi-view reconstruction is your jam; skip it for battle-tested production use until more benchmarks land.


