opendatalab / MinerU-Diffusion
A diffusion-based framework for document OCR that replaces autoregressive decoding with block-level parallel diffusion decoding.
MinerU-Diffusion is an open-source AI system that extracts structured text, tables, formulas, and layouts from document images using efficient diffusion-based decoding.
How It Works
You hear about a smart tool that turns photos of documents into clean, editable text, tables, and formulas.
You install the tool and its dependencies on your computer so it can work with your document images.
You download the model files that let the tool understand documents.
You pick an image of a page, such as a scanned report or a book page, for the tool to read.
You tell it what to find: full page structure, plain text, tables, or math formulas.
The tool scans the image and pulls out all the content in seconds.
You get neat, structured text ready to edit, copy, or use anywhere, saving hours of manual work.
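To make the idea of block-level parallel decoding more concrete, here is a toy sketch in Python. It is not MinerU-Diffusion's code or API: the names `decode_block` and `make_toy_predictor` are hypothetical, and the "model" is a stand-in that already knows the answer. It only illustrates the general pattern of starting from a fully masked block and revealing several tokens per step, instead of one token per step as in autoregressive decoding.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"
# A predictor takes the current block and returns (token, confidence) per position.
Predictor = Callable[[List[str]], List[Tuple[str, float]]]


def decode_block(predict: Predictor, block_len: int, per_step: int) -> Tuple[List[str], int]:
    """Fill a fully masked block, revealing up to `per_step` tokens per step."""
    if per_step < 1:
        raise ValueError("per_step must be at least 1")
    block = [MASK] * block_len
    steps = 0
    while MASK in block:
        guesses = predict(block)  # one call scores every position at once
        masked = [i for i, tok in enumerate(block) if tok == MASK]
        # Commit the most confident guesses; the rest stay masked for the next step.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[:per_step]:
            block[i] = guesses[i][0]
        steps += 1
    return block, steps


def make_toy_predictor(target: List[str]) -> Predictor:
    """Hypothetical stand-in for a trained model: it returns the known target
    tokens with made-up confidences that decrease from left to right."""
    def predict(block: List[str]) -> List[Tuple[str, float]]:
        return [(target[i], 1.0 / (1 + i)) for i in range(len(block))]
    return predict


target = ["Total", ":", "42", "USD"]
tokens, steps = decode_block(make_toy_predictor(target), block_len=4, per_step=2)
print(tokens, steps)
```

In this toy run the four-token block is filled in two steps, where one-token-at-a-time decoding would need four. A real system would replace the toy predictor with a trained model conditioned on the page image, and the way it chooses which tokens to commit may differ from the simple top-confidence rule assumed here.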