[ICLR 2026] PAGE-4D: Disentangled Pose and Geometry Estimation for VGGT-4D Perception
PAGE4D is a feed-forward neural network that estimates camera poses, depth maps, and dense 3D point clouds from multi-view images of dynamic scenes including moving objects.
How It Works
You hear about PAGE4D, a smart tool that turns photos of moving scenes into 3D models with camera positions.
Download the tool and its pre-trained model weights so everything is set up on your computer.
Collect a handful of pictures from different angles of something moving, like a person dancing.
Feed your photos into the tool with a few lines of code, and it processes them in a single feed-forward pass.
In seconds, you see camera positions, depth maps, and dense 3D points of your moving scene appear.
Check out the depth maps, 3D points, and camera paths to understand the geometry and motion in your scene.
You've got a complete moving 3D model ready for videos, AR, or further analysis!
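To make the link between depth maps, camera positions, and 3D points concrete, here is a minimal sketch of how a single depth map can be lifted into world-space points. It is not PAGE4D's API or its internal method: it assumes a simple pinhole camera and a camera-to-world pose, and the function name `unproject_depth` and all parameter names are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def unproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Point]:
    """Lift every pixel with positive depth into world coordinates.

    Assumptions (not taken from PAGE4D): a pinhole camera with focal
    lengths (fx, fy) and principal point (cx, cy), and a camera-to-world
    pose (rotation, translation) so that X_world = R * X_cam + t.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v is the pixel row
        for u, z in enumerate(row):        # u is the pixel column
            if z <= 0:
                continue                   # skip pixels with no valid depth
            # Back-project the pixel into the camera frame.
            cam = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Move the point from the camera frame into the world frame.
            world = tuple(
                sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            points.append(world)
    return points


# Toy 2x2 depth map, identity rotation, camera shifted 1 unit along z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = unproject_depth(
    depth=[[2.0, 2.0], [2.0, 0.0]],
    fx=1.0, fy=1.0, cx=0.5, cy=0.5,
    rotation=identity,
    translation=[0.0, 0.0, 1.0],
)
print(points)
```

In this toy input the bottom-right pixel has zero depth and is skipped, so three points come back. Repeating the same step for every view, each with its own pose, is one common way to merge per-view depth into a single point cloud.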