vorojar / Folio-OCR

Public

Open-source batch OCR workbench: a free, local alternative to ABBYY FineReader. Powered by Ollama + GLM-OCR + PP-DocLayoutV3, ~0.5s/page on RTX 4090. Three-panel editor, layout-aware, PDF/image batch processing, Markdown/Word export. A batch OCR workbench that runs entirely locally, a free substitute for ABBYY, suited to digitizing books and documents.

67
8
100% credibility
Found Feb 17, 2026 at 48 stars.
AI Analysis
JavaScript
AI Summary

Folio-OCR is a browser-based workbench that turns scanned documents and images into editable, searchable text using AI recognition, with tools for proofreading and exporting.

How It Works

1. 🌐 Open the app

Open the Folio-OCR page in your browser; it shows an upload area for your documents.

2. 📤 Upload your files

Drag and drop PDFs or images; the app splits them into pages you can browse.

3. 🧠 Prepare the reader

Click to load the recognition model so it is ready to process your pages.

4. Recognize the text

Click the button to process all pages. Handwritten or printed text becomes editable text, with highlights marking the regions that were read.

5. ✏️ Review and tweak

Page through the results, correct mistakes, search the text, or reflow paragraphs so they read cleanly.

6. 📄 Save your work

Export everything as a document file, ready to read or share, with your edits preserved.
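The steps above amount to a batch loop with progress reporting, followed by an export. A minimal Python sketch of that flow follows. The function names `ocr_batch` and `to_markdown`, the progress callback signature, and the "## Page N" heading layout are all illustrative assumptions, not Folio-OCR's actual code.

```python
from typing import Callable, Iterable, List


def ocr_batch(
    pages: Iterable[bytes],
    recognize: Callable[[bytes], str],
    on_progress: Callable[[int, int], None] = lambda done, total: None,
) -> List[str]:
    """Run `recognize` on every page and report progress after each one."""
    page_list = list(pages)
    results = []
    for index, page in enumerate(page_list, start=1):
        results.append(recognize(page))
        on_progress(index, len(page_list))
    return results


def to_markdown(page_texts: List[str]) -> str:
    """Join per-page text into one Markdown document (hypothetical layout)."""
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    # Stand-in recognizer so the sketch runs without a GPU or a model.
    def fake_recognize(page: bytes) -> str:
        return page.decode("utf-8").upper()

    texts = ocr_batch(
        [b"first page", b"second page"],
        fake_recognize,
        on_progress=lambda done, total: print(f"{done}/{total} pages done"),
    )
    print(to_markdown(texts))
```

Passing the recognizer in as a callable keeps the loop testable without a model; a real recognizer would replace `fake_recognize`.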


Star Growth

This repo grew from 48 to 67 stars.
AI-Generated Review

What is Folio-OCR?

Folio-OCR is a local web app for batch OCR on PDFs and images, turning scanned books into editable Markdown or DOCX at about 0.5 seconds per page on a notebook with an RTX 4090 GPU. Built with a JavaScript frontend and a Python backend, it runs GLM-OCR inference through Ollama and uses the PP-DocLayoutV3 layout model to detect text blocks, skipping headers and footers automatically. Users drag and drop files to get page previews, OCR results, and exports.
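A rough Python sketch of the two backend ideas in that description: dropping header and footer blocks from a layout result, and sending an image to a local Ollama server. The `/api/generate` endpoint and its `model`, `prompt`, `images`, and `stream` fields are part of Ollama's REST API. The model tag `glm-ocr`, the prompt text, the block dictionary shape, and the `header`/`footer` label names are assumptions made for illustration and may not match the project.

```python
import base64
import json
import urllib.request
from typing import Dict, List

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address
MODEL_TAG = "glm-ocr"                    # assumed model tag
SKIPPED_LABELS = {"header", "footer"}    # assumed layout label names


def keep_body_blocks(blocks: List[Dict]) -> List[Dict]:
    """Drop layout blocks labelled as page headers or footers."""
    return [block for block in blocks if block["label"] not in SKIPPED_LABELS]


def build_ocr_request(image_bytes: bytes,
                      prompt: str = "Transcribe the text in this image.") -> Dict:
    """Build a non-streaming /api/generate payload carrying one base64 image."""
    return {
        "model": MODEL_TAG,
        "prompt": prompt,  # hypothetical prompt, not taken from the project
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def recognize(image_bytes: bytes) -> str:
    """Send one cropped block to Ollama and return the generated text."""
    body = json.dumps(build_ocr_request(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `recognize` requires a running Ollama server with the model already pulled; `keep_body_blocks` and `build_ocr_request` work offline.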

Why is it gaining traction?

It stands out with a three-panel UI: thumbnails on the left, an image preview in the middle with clickable bounding boxes, and the linked editable text regions on the right, plus full-text search and paragraph reflow for clean CJK output. Batch processing with progress tracking and background pre-OCR feels snappy, and the Docker Compose setup gets Ollama with GPU support running in minutes, with no cloud costs or API limits. Developers dig the interactive proofreading workflow over clunky CLI tools.
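Paragraph reflow matters for CJK because joining hard-wrapped lines with a space, as one would for English, inserts stray spaces into Chinese or Japanese text. One plausible way to handle it, sketched in Python: join lines directly when both neighboring characters are CJK, and with a single space otherwise. The function name and the character ranges are illustrative choices, not the project's implementation.

```python
import re

# CJK punctuation, kana, unified ideographs, and full-width forms.
_CJK_CHAR = re.compile(r"[\u3000-\u303f\u3040-\u30ff\u4e00-\u9fff\uff00-\uffef]")


def reflow_paragraph(lines):
    """Join hard-wrapped lines: no space between CJK characters, one space otherwise."""
    result = ""
    for line in (raw.strip() for raw in lines):
        if not line:
            continue  # ignore blank lines inside the paragraph
        if not result:
            result = line
        elif _CJK_CHAR.match(result[-1]) and _CJK_CHAR.match(line[0]):
            result += line  # CJK next to CJK: join directly
        else:
            result += " " + line  # Latin or mixed boundary: keep a space
    return result


if __name__ == "__main__":
    print(reflow_paragraph(["这是一个", "测试。"]))
    print(reflow_paragraph(["scanned text", "wraps here"]))
```

This sketch does not rejoin words hyphenated across line breaks, which a fuller reflow for Latin-script text would need.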

Who should use this?

Document digitizers scanning academic papers or books in Chinese/English, researchers extracting tables from PDFs without manual tweaks, or indie devs prototyping local OCR pipelines. Ideal for anyone with a decent GPU tired of Tesseract's accuracy issues on complex layouts.

Verdict

Try Folio-OCR if you need fast, local OCR in a polished UI. It's a solid prototype, though 47 stars, a 1.0% credibility score, and minimal docs signal an early-stage project. Pull the models it needs with Ollama to get started quickly, but watch for stability in production.
