zai-org / GLM-OCR

Public

GLM-OCR: Accurate × Fast × Comprehensive

1,825 stars · 100% credibility
Found Feb 03, 2026 at 362 stars
Language: Python

AI Summary

GLM-OCR is an AI-powered OCR web app and SDK that extracts structured text, tables, formulas, and layouts from complex documents like PDFs and scans.

How It Works

1
🔍 Discover smart document reader

You find GLM-OCR, a friendly tool that reads complex papers, scans, and PDFs, turning messy layouts into clean, editable text.

2
📥 Get the app ready

Download and start the web app in a few clicks; no technical setup needed.

3
📤 Upload your document

Drag your PDF or image file into the app and watch it load instantly.

4
👀 See live results

The app reads your document, showing a preview on one side and neat text results on the other with clickable highlights.

5
🔍 Click to explore

Tap any text block to zoom right to its spot in the preview, making verification fun and easy.

6
💾 Grab your results

Copy the clean text or download it as a Markdown file, ready for editing or sharing.

🎉 Perfect extraction done

Your document is now structured and readable, saving hours of manual work.
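The upload → parse → export flow above can also be scripted. Only the `glmocr parse file.png` CLI form is confirmed by this page, so the sketch below shells out to that command rather than guessing at a Python API; the helper names are my own:

```python
import shlex
import subprocess
from pathlib import Path


def build_parse_command(document: str) -> list[str]:
    """Build the CLI invocation for one document (mirrors `glmocr parse file.png`)."""
    return ["glmocr", "parse", str(Path(document))]


def parse_document(document: str) -> str:
    """Run the CLI and return its stdout, which the page says is clean Markdown."""
    result = subprocess.run(
        build_parse_command(document), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Show the command that would run, without requiring glmocr to be installed.
    print(shlex.join(build_parse_command("scan.pdf")))
```

Keeping command construction separate from execution makes the wrapper easy to test without the tool installed.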

AI-Generated Review

What is GLM-OCR?

GLM-OCR is a Python SDK delivering accurate, fast, comprehensive OCR powered by GLM model variants such as GLM 4.5 OCR and GLM 4.6 OCR. It extracts text, tables, formulas, and layouts from images or PDFs, outputting clean Markdown or structured JSON via simple CLI commands like `glmocr parse file.png` or Python API calls. Developers get robust document parsing without managing low-level vision models.
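The page says the SDK can emit structured JSON as well as Markdown, but does not document the JSON schema. As a sketch only, assuming a list of blocks with `type` and `text` keys (my assumption, not the SDK's actual layout), flattening that output back to Markdown might look like:

```python
def blocks_to_markdown(blocks: list[dict]) -> str:
    """Flatten block-level OCR output into one Markdown string.

    The ``type``/``text`` block schema here is an assumption for
    illustration; consult the real SDK docs for its actual JSON layout.
    """
    parts = []
    for block in blocks:
        kind = block.get("type", "text")
        text = block.get("text", "")
        if kind == "heading":
            parts.append(f"# {text}")
        elif kind == "formula":
            # Wrap recognized formulas as display math.
            parts.append(f"$$\n{text}\n$$")
        else:
            # Plain text; tables are assumed to arrive already rendered as Markdown.
            parts.append(text)
    return "\n\n".join(parts)
```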

Why is it gaining traction?

It tops benchmarks like OmniDocBench at 94.62 while running efficiently on 0.9B params via vLLM, SGLang, or Ollama—far snappier than heavier alternatives. The SDK wraps cloud APIs for zero-setup starts or self-hosted servers, plus Flask endpoints for easy integration. No fluff: one-line installs yield production-ready GLM OCR results.

Who should use this?

Backend engineers building doc pipelines for invoices or reports, AI devs parsing code-heavy PDFs, or teams needing reliable table/formula extraction in Python apps. Ideal for replacing PaddleOCR when accuracy on complex layouts matters.

Verdict

Grab it for cutting-edge GLM OCR if you need speed and precision; 827 stars reflect real dev interest, though a 1.0% credibility score signals early maturity. Docs shine, but alpha status means test edge cases before scaling.


