cvlab-kaist / GLD

Public

Official implementation of "Repurposing Geometric Foundation Models for Multi-view Diffusion"

85
2
100% credibility
Found Mar 25, 2026 at 85 stars.
AI Analysis
Python
AI Summary

Geometric Latent Diffusion is an open-source research project that generates novel views and 3D reconstructions from multi-view images using geometric AI models.

How It Works

1. Discover GLD

You find this cool project that turns a few photos into new viewpoints and 3D models.

2. Get ready

Follow simple steps to prepare your computer so everything runs smoothly.

3. Grab smart models

Download ready-to-use brainpower that understands shapes and scenes.

4. Run the demo

Click to launch and watch it create new angles from sample photos.

5. Magic happens

New views appear and a colorful 3D scene builds itself before your eyes.

6. Explore your 3D world

Open the scene in a free viewer and rotate around your creation.

You've got 3D superpowers

Now you can recreate scenes from photos anytime with realistic new views.

AI-Generated Review

What is GLD?

GLD repurposes geometric foundation models such as Depth Anything 3 and VGGT for multi-view diffusion, generating consistent novel views from sparse camera inputs without text conditioning. It produces zero-shot depth maps and 3D geometry directly from latents, addressing the slow convergence of traditional novel view synthesis (NVS) pipelines. The repo is written in Python, with pretrained models on Hugging Face and a demo script that exports RGB views and 3D geometry as GLB.

Why is it gaining traction?

Training is reported to be 4.4x faster than VAE-based methods, with state-of-the-art scores on the RealEstate10K and DL3DV benchmarks. One-command demos produce COLMAP-compatible outputs, and a cascade mode refines latents coarse-to-fine for better geometry. The official GitHub repository ships CLI scripts for training and evaluation, and the efficiency gains are drawing attention in computer vision circles.
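For readers who want to work with the COLMAP-compatible outputs, here is a minimal sketch of recovering camera centers from poses. It assumes the export follows COLMAP's text format, where each image in `images.txt` has a pose line (`IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME`) followed by a line of 2D points; whether GLD writes text or binary files is not stated here, so treat that as an assumption. The function names are hypothetical. COLMAP stores world-to-camera poses, so the camera center is C = -R^T t.

```python
def quat_to_rotmat(qw, qx, qy, qz):
    """Convert a (w, x, y, z) quaternion to a 3x3 rotation matrix."""
    n = (qw * qw + qx * qx + qy * qy + qz * qz) ** 0.5
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_centers(images_txt: str) -> dict:
    """Map image name -> camera center (x, y, z) in world coordinates."""
    centers = {}
    lines = iter(images_txt.splitlines())
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip comments and stray blank lines before a pose line
        fields = line.split(maxsplit=9)
        qw, qx, qy, qz, tx, ty, tz = (float(v) for v in fields[1:8])
        name = fields[9]
        next(lines, "")  # consume the 2D-points line (it may be empty)
        rot = quat_to_rotmat(qw, qx, qy, qz)
        t = (tx, ty, tz)
        # C = -R^T t: column i of R dotted with t, then negated.
        centers[name] = tuple(
            -sum(rot[j][i] * t[j] for j in range(3)) for i in range(3)
        )
    return centers


if __name__ == "__main__":
    sample = (
        "# Image list with two lines of data per image\n"
        "1 1 0 0 0 1 2 3 1 view_a.png\n"
        "\n"
        "2 0.7071067811865476 0 0 0.7071067811865476 1 0 0 1 view_b.png\n"
        "10.0 20.0 -1\n"
    )
    for name, center in camera_centers(sample).items():
        print(name, center)
```

The parser keeps the two-lines-per-image pairing explicit because the points line can be empty, which a naive "skip blank lines" loop would mishandle.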

Who should use this?

Computer vision researchers benchmarking diffusion models for NVS. 3D reconstruction engineers who need geometry priors without training custom VAEs. KAIST- or Intel-affiliated teams prototyping multi-view synthesis on CUT3R datasets.

Verdict

A solid paper implementation worth forking for experiments (85 stars), but the 1.0% credibility score signals an early-stage project: docs are solid, tests are sparse. Grab the official GitHub releases if you have 48GB of VRAM; skip it for production until gld_anni stabilizes.
