tiiuae / Falcon-Perception
Inference repo for the Falcon-Perception and Falcon-OCR models: early-fusion, natively multimodal, dense autoregressive Transformer models.
Falcon Perception is a vision-language model that detects objects, segments instances, and extracts OCR text from images using natural language queries, with efficient inference engines for GPU and Apple Silicon.
How It Works
You learn about a model that interprets everyday images from plain-text descriptions: it spots objects, outlines them precisely, and reads any text inside.
Visit the demo page to upload any photo and type what you want to find, like 'the cat on the couch' or 'extract the receipt text'.
In seconds, you see colored outlines around what you described, or the extracted text, with no setup needed.
Download and launch it with a few clicks to process your own photos privately, on a GPU or an Apple Silicon Mac.
Tweak settings for sharper results, batch many images, or build a web app for friends to use.
You can now analyze photos, extract data, or build tools on top of the model, turning a plain-text description into precise visual output.