allenai / WildDet3D

Public

Allen Institute for AI: WildDet3D: Scaling Promptable 3D Detection in the Wild

81
6
100% credibility
Found Apr 08, 2026 at 81 stars.
AI Analysis
Python
AI Summary

WildDet3D is an AI system that detects objects in images and estimates their 3D positions and sizes using simple text descriptions, drawn boxes, or clicked points.

How It Works

1. 👀 Discover WildDet3D

You stumble upon a cool demo video showing everyday photos turning into 3D scenes with labeled objects.

2. 📱 Try the online playground

Head to the free web demo, upload any photo from your phone or computer, and start exploring.

3. Describe what you see

Simply type names like 'chair table car' and instantly watch colorful 3D boxes wrap around every matching object.

4. 🎯 Point or draw to guide

Click points on objects or draw quick boxes to tell it exactly what to focus on, like picking out a specific bike.

5. Pick your adventure

🖥️ Web explorer: Upload more photos and experiment with different prompts right in your browser.

📱 Phone app: Download the free iPhone app for real-time detection using your camera anywhere.

6. 🕶️ View in 3D or AR

Spin around interactive 3D views or see overlays in augmented reality on your phone.

🚀 Unlock 3D vision

Now you can understand the full 3D shape and position of anything in your photos, ready for fun projects or apps.

AI-Generated Review

What is WildDet3D?

WildDet3D detects 3D objects in single RGB images from text prompts like "person" or interactive box and point prompts, outputting oriented bounding boxes and depth maps. This Python library from the Allen Institute for AI scales open-vocabulary detection to unstructured "in-the-wild" scenes, handling 800+ categories via SAM3 segmentation and monocular depth estimation like LingBot. Grab the pretrained weights from Hugging Face, run inference on your own images, and export GLB for AR/VR.
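To make "oriented bounding box" concrete, here is a minimal sketch of how such a box can be expanded into its eight corners. The `Box3D` class, its field names, and the yaw-only rotation are assumptions for illustration; they are not taken from WildDet3D's code, whose real output format may differ (for example, a full 3D rotation).

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class Box3D:
    """Hypothetical container for one oriented box (not WildDet3D's own type)."""
    center: Point3  # (x, y, z) in camera coordinates
    size: Point3    # extent along the box's local x, y and z axes
    yaw: float      # rotation about the camera's y axis, in radians


def box_corners(box: Box3D) -> List[Point3]:
    """Return the 8 corners of the box in camera coordinates."""
    cx, cy, cz = box.center
    w, h, l = box.size
    c, s = math.cos(box.yaw), math.sin(box.yaw)
    corners = []
    for dx in (-w / 2, w / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-l / 2, l / 2):
                # Rotate the local offset about the y axis, then translate.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners
```

Calling `box_corners(Box3D((0.0, 0.0, 5.0), (2.0, 1.0, 4.0), 0.0))` yields the corners of an axis-aligned box centred five units in front of the camera; a non-zero `yaw` spins the same box about the vertical axis.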

Why is it gaining traction?

It posts strong numbers on WildDet3D-Bench (41% AP with text prompts, 47% with box prompts) and Omni3D (up to 46% AP), beating prior methods on rare objects without custom training. Ready-to-use demos include HF Spaces, a real-time iPhone app, Meta Quest VR, and robotics pipelines, plus zero-shot tracking. The GitHub repo offers one-line inference with supplied camera intrinsics or defaults, and accepts optional ground-truth depth for extra precision.
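Camera intrinsics matter because they tie pixels to 3D points. The sketch below shows the standard pinhole model: a guessed set of default intrinsics, projection of a 3D point to a pixel, and the reverse step when depth is known. The function names and the 60 degree field-of-view default are assumptions chosen for illustration; the defaults WildDet3D actually uses are not stated here.

```python
import math


def default_intrinsics(width, height, hfov_deg=60.0):
    """Guess pinhole intrinsics (fx, fy, cx, cy) from the image size alone.

    Assumes square pixels, a principal point at the image centre, and a
    horizontal field of view of hfov_deg (an illustrative default only).
    """
    fx = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return fx, fx, width / 2, height / 2


def project(point, intrinsics):
    """Project a camera-space point (x, y, z) with z > 0 to pixel (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    fx, fy, cx, cy = intrinsics
    return fx * x / z + cx, fy * y / z + cy


def backproject(u, v, depth, intrinsics):
    """Lift pixel (u, v) with a known depth back to a camera-space point."""
    fx, fy, cx, cy = intrinsics
    return (u - cx) * depth / fx, (v - cy) * depth / fy, depth
```

A wrong focal length scales every back-projected point, which is why supplying real intrinsics, or a measured depth map where available, generally gives tighter 3D boxes than relying on a guess.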

Who should use this?

Robotics engineers building manipulation pipelines from phone cameras, AR developers placing objects in live video, and CV researchers prototyping 3D perception on in-the-wild datasets like COCO/LVIS. Ideal for teams that need 3D detection without LiDAR rigs.

Verdict

Strong pick for promptable 3D detection experiments: inference is polished and the docs are solid, with videos. At 81 stars and a 1.0% credibility score, though, it is early-stage; wait for the training code before using it in production. Coming from the Allen Institute for AI, it is worth watching.
