princeton-vl

Official code for Zero-Shot Depth from Defocus (https://arxiv.org/abs/2603.26658)

Found Mar 31, 2026 at 40 stars
Language: Python
AI Summary

FOSSA provides evaluation code and datasets for testing zero-shot depth estimation models from defocus cues in focal stack images.

How It Works

1
📖 Discover FOSSA

You stumble upon this exciting research project from Princeton that estimates depth in photos using focus blur, complete with benchmarks and ready-to-test models.

2
💻 Prepare your workspace

You create a fresh Python environment on your machine and install the project's dependencies so everything runs smoothly.

3
📥 Gather image collections

You download the benchmark datasets (focal stacks paired with ground-truth depth) from the project's Hugging Face links and organize them into the expected directory layout.
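The loading side of this step can be sketched in a few lines. Everything here is hypothetical: the real datasets define their own directory structure and file formats, so treat the `focus_*.npy` slices and `depth.npy` map as stand-ins.

```python
import tempfile
from pathlib import Path
import numpy as np

def load_focal_stack(scene_dir: Path):
    """Load a focal stack as a (K, H, W) array plus its (H, W) depth map.
    The layout here (focus_XX.npy slices + depth.npy) is a made-up stand-in;
    the real benchmarks define their own formats."""
    slices = sorted(scene_dir.glob("focus_*.npy"))
    stack = np.stack([np.load(p) for p in slices])
    depth = np.load(scene_dir / "depth.npy")
    return stack, depth

# Build a tiny dummy scene to exercise the loader.
scene = Path(tempfile.mkdtemp()) / "scene_000"
scene.mkdir()
for k in range(3):
    np.save(scene / f"focus_{k:02d}.npy", np.zeros((4, 4), dtype=np.float32))
np.save(scene / "depth.npy", np.ones((4, 4), dtype=np.float32))

stack, depth = load_focal_stack(scene)
print(stack.shape, depth.shape)  # (3, 4, 4) (4, 4)
```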

4
🔧 Build the blur simulator

You build the CUDA-accelerated point-spread-function (PSF) renderer that simulates realistic camera defocus, unlocking synthetic focal stack testing.
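A minimal, CPU-only sketch of the idea behind a defocus simulator, assuming a thin-lens circle-of-confusion model and a Gaussian PSF. The repo's actual renderer is CUDA-accelerated and surely more sophisticated; all parameter names and values here are made up for illustration.

```python
import numpy as np

def coc_radius_px(depth_m, focus_m, focal_mm=50.0, f_number=2.0, px_per_mm=20.0):
    """Thin-lens circle-of-confusion radius in pixels for a point at depth_m
    when the lens is focused at focus_m. Standard approximation:
    c = A * f * |z - zf| / (z * (zf - f)), with aperture diameter A = f / N.
    Illustrative camera parameters, not FOSSA's actual model."""
    f = focal_mm * 1e-3                       # focal length in metres
    A = f / f_number                          # aperture diameter in metres
    c = A * f * np.abs(depth_m - focus_m) / (depth_m * (focus_m - f))
    return c * 1e3 * px_per_mm / 2.0          # CoC diameter (m) -> radius (px)

def gaussian_blur(img, sigma):
    """Separable Gaussian blur using plain numpy convolutions."""
    if sigma < 1e-6:
        return img.copy()
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    k /= k.sum()
    pad = np.pad(img, r, mode="edge")
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, pad)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, tmp)

# Render a tiny focal stack of a textured plane at 1 m, treating the CoC
# radius as the Gaussian sigma (a common simplification).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
z_true, focus_dists = 1.0, [0.5, 1.0, 2.0]
stack = [gaussian_blur(img, coc_radius_px(z_true, zf)) for zf in focus_dists]
sharpness = [np.var(np.diff(s, axis=0)) for s in stack]
print(int(np.argmax(sharpness)))  # the slice focused at the true depth wins -> 1
```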

5
🤖 Pick a smart model

You download a pretrained checkpoint (ViT-S or ViT-B) from the Hugging Face model hub to analyze the focus stacks.

6
▶️ Test on photo sets

You run the evaluation script across the benchmark datasets and watch metrics such as AbsRel and δ-accuracy appear.
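The metrics behind this step are standard in depth evaluation. A minimal sketch, assuming AbsRel, RMSE, and the δ<1.25 threshold accuracy are among those reported; the repo's eval script defines its own exact metric set.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth-eval metrics over valid (gt > 0) pixels.
    Illustrative only: the repo's eval script defines its own metric set."""
    m = gt > 0
    p, g = pred[m], gt[m]
    absrel = float(np.mean(np.abs(p - g) / g))      # mean absolute relative error
    rmse = float(np.sqrt(np.mean((p - g) ** 2)))    # root mean squared error
    ratio = np.maximum(p / g, g / p)
    delta1 = float(np.mean(ratio < 1.25))           # threshold accuracy
    return {"AbsRel": absrel, "RMSE": rmse, "delta1": delta1}

gt = np.array([1.0, 2.0, 4.0])
pred = np.array([1.1, 2.0, 3.0])
print(depth_metrics(pred, gt))
```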

7
🏆 Join the leaderboard

Your depth estimates land on the benchmark board, ready to share or submit to the ZEDD test server for ranking.


AI-Generated Review

What is FOSSA?

FOSSA estimates metric depth maps from focal stacks—sets of images taken at different focus distances—using zero-shot inference in Python with PyTorch. It leverages defocus blur cues to outperform monocular methods on datasets like ZEDD, Infinigen Defocus, and DDFF12, delivering predictions via pretrained ViT-S or ViT-B models loaded from Hugging Face. Run validation with a single bash command like `dist_val.sh --encoder vits --resumed_from fossa-vits`, and submit to the ZEDD test server for leaderboard scores.
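For intuition about the defocus cue FOSSA learns from, here is the classic non-learned depth-from-focus baseline: assign each pixel the focus distance of its sharpest slice. This is not FOSSA's method, just a sketch of the underlying signal on a toy scene.

```python
import numpy as np

def laplacian(img):
    """Discrete 5-point Laplacian with edge padding."""
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * img

def depth_from_focus(stack, focus_dists):
    """Assign each pixel the focus distance of its sharpest slice."""
    energy = np.stack([laplacian(s) ** 2 for s in stack])  # (K, H, W)
    return np.asarray(focus_dists)[np.argmax(energy, axis=0)]

# Toy scene: three vertical bands, each "in focus" (textured) in exactly
# one slice and flat (blurred away) in the others.
rng = np.random.default_rng(0)
K, H, W = 3, 6, 9
focus_dists = [0.5, 1.0, 2.0]
stack = np.zeros((K, H, W))
for k in range(K):
    stack[k, :, 3 * k : 3 * k + 3] = rng.random((H, 3))
depth = depth_from_focus(stack, focus_dists)
print(depth[0, 1], depth[0, 4], depth[0, 7])  # 0.5 1.0 2.0
```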

Why is it gaining traction?

Reproduces paper results on defocus benchmarks with minimal setup, including auto-downloaded datasets from Hugging Face and CUDA-accelerated PSF rendering for synthetic stacks. FOSSA shines for quick evaluation on real focal stacks, beating baselines like Depth Anything on HAMMER and iBims without fine-tuning.

Who should use this?

CV researchers testing defocus depth on robotics or microscopy rigs with multi-focus cameras. AR/VR developers needing precise near-field depth from consumer lens stacks and close-up sensing.

Verdict

Strong academic eval tool from Princeton VL, but at only 40 stars the project is still early: docs are solid, yet training code awaits a 2026 release. Use the HF models now for focal stack depth; skip it if you need production polish.


