ninehills / pdf2md

Public

PDF to Markdown OCR tools

42
3
89% credibility
Found May 22, 2026 at 43 stars
AI Analysis
Go
AI Summary

pdf2md is a tool that transforms PDF documents into clean, editable Markdown text. You simply point it at any PDF file, and it uses AI models running in isolated containers to "read" each page and extract the content as properly formatted text. It handles complex layouts, mixed content types, and even equations. The result is a ready-to-use text file plus detailed information about what was converted. Everything runs automatically — the tool downloads the AI models it needs and manages all the technical work behind the scenes, leaving you with just your converted document at the end.

How It Works

1
📄 You have a PDF you want to convert

Maybe it's a research paper, a scanned contract, or an ebook you want to edit — but it's stuck in PDF format and you need it as clean text.

2
🛠️ You check your computer can run it

The tool needs Docker with GPU access and a CUDA-capable graphics card, which many machines set up for gaming or AI work already have.

3
⬇️ You download one small file

The tool comes as a single ready-to-run program for your operating system — no installation wizards, no complicated setup screens.

4
You point it at your PDF with one command

You type the name of the tool and your file, and press enter — that's the whole interaction needed from you.

5
🤖 The AI gets to work behind the scenes

The tool automatically starts up AI services that read each page of your document, understanding text, layout, and even equations.

6
📁 Everything gets organized for you

You receive a complete text version of your document, along with a detailed breakdown of each page and where everything came from.

Your editable document is ready

You now have your content as clean, searchable, editable text — no more retyping or fighting with PDF formatting.
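
The prerequisite check in step 2 could be sketched in Go roughly as follows. This is a minimal illustration, not pdf2md's own code: the helper name `missingTools` is hypothetical, and treating `docker` and `nvidia-smi` on the PATH as a stand-in for "Docker with GPU access" is an assumption.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns every name in required for which found reports false.
// The lookup is passed in so the check can be exercised without touching PATH.
func missingTools(required []string, found func(string) bool) []string {
	var missing []string
	for _, name := range required {
		if !found(name) {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumption: these two executables are a reasonable proxy for the
	// Docker and GPU requirement; the real tool may check differently.
	required := []string{"docker", "nvidia-smi"}

	onPath := func(name string) bool {
		_, err := exec.LookPath(name)
		return err == nil
	}

	if missing := missingTools(required, onPath); len(missing) > 0 {
		fmt.Println("missing prerequisites:", missing)
		return
	}
	fmt.Println("docker and nvidia-smi found on PATH")
}
```

Finding both executables only shows they are installed. It does not prove that containers can actually reach the GPU.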

AI-Generated Review

What is pdf2md?

pdf2md is a command-line tool that converts PDF documents into Markdown using vision language models running locally through Docker. Built in Go, it produces a single binary that handles the entire pipeline: PDF rendering, layout detection, OCR, and Markdown synthesis. It supports three VLM backends (dots-ocr, logics-parsing-v2, and paddleocr-vl-1.5-gguf) with automatic model downloads, outputting per-page images alongside combined Markdown and JSON files.
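
The final step described above, where per-page results become one combined Markdown file and one JSON file, might look something like this sketch. The `PageResult` type, its field names, and the image file names are all hypothetical; the actual output schema of pdf2md is not described in this review.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// PageResult is a hypothetical per-page record. Field names are illustrative
// and are not taken from pdf2md.
type PageResult struct {
	Page     int    `json:"page"`
	Image    string `json:"image"`
	Markdown string `json:"markdown"`
}

// combine joins the per-page Markdown into a single document, separated by
// blank lines, and serializes the per-page records as indented JSON.
func combine(pages []PageResult) (string, []byte, error) {
	parts := make([]string, 0, len(pages))
	for _, p := range pages {
		parts = append(parts, strings.TrimSpace(p.Markdown))
	}
	data, err := json.MarshalIndent(pages, "", "  ")
	if err != nil {
		return "", nil, err
	}
	return strings.Join(parts, "\n\n") + "\n", data, nil
}

func main() {
	pages := []PageResult{
		{Page: 1, Image: "page-0001.png", Markdown: "# Title\n"},
		{Page: 2, Image: "page-0002.png", Markdown: "Body text.\n"},
	}
	md, data, err := combine(pages)
	if err != nil {
		fmt.Println("combine failed:", err)
		return
	}
	fmt.Print(md)
	fmt.Println(string(data))
}
```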

Why is it gaining traction?

The tool eliminates Python dependencies, CUDA installations, and runtime configuration entirely -- if you have Docker with GPU access, it works. Developers working with research papers, technical documentation, or multi-column layouts find value in the model flexibility: choose dots-ocr for fast layout-aware OCR, logics-parsing-v2 for HTML-structured extraction, or paddleocr-vl-1.5-gguf for a fully local GGUF-based pipeline without API calls. The two-stage PaddleOCR approach (ONNX layout detection feeding a VLM) is particularly interesting for those wanting GPU-accelerated inference without cloud services.
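
To make the two-stage idea concrete, here is a rough Go sketch in which a layout detector proposes regions and a recognizer turns each region into text. Every name here (`Region`, `LayoutDetector`, `Recognizer`, `convertPage`) is an assumption made for illustration, and the stubs stand in for the ONNX layout model and the VLM; it is not how pdf2md is actually structured.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Region is a hypothetical layout box produced by the first stage.
type Region struct {
	Kind  string // for example "title" or "text"
	Order int    // reading order assigned by layout detection
}

// LayoutDetector stands in for stage one (layout detection).
type LayoutDetector func(pageImage string) []Region

// Recognizer stands in for stage two (the VLM reading a single region).
type Recognizer func(pageImage string, r Region) string

// convertPage runs detection, sorts regions into reading order, recognizes
// each region, and joins the results as Markdown blocks.
func convertPage(pageImage string, detect LayoutDetector, recognize Recognizer) string {
	regions := detect(pageImage)
	sort.SliceStable(regions, func(i, j int) bool {
		return regions[i].Order < regions[j].Order
	})
	blocks := make([]string, 0, len(regions))
	for _, r := range regions {
		text := recognize(pageImage, r)
		if r.Kind == "title" {
			text = "# " + text
		}
		blocks = append(blocks, text)
	}
	return strings.Join(blocks, "\n\n")
}

func main() {
	// Stub stages so the sketch runs without any model.
	detect := func(string) []Region {
		return []Region{{Kind: "text", Order: 2}, {Kind: "title", Order: 1}}
	}
	recognize := func(_ string, r Region) string {
		if r.Kind == "title" {
			return "Introduction"
		}
		return "First paragraph."
	}
	fmt.Println(convertPage("page-0001.png", detect, recognize))
}
```

Keeping the two stages behind separate function types means either one can be swapped out without touching the other.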

Who should use this?

Researchers converting academic papers to editable Markdown, developers building documentation pipelines, and engineers needing offline PDF processing will get the most from this. Data engineers working with scanned documents and anyone stuck with proprietary PDF formats where copy-paste fails will find it valuable. The GPU requirement is non-negotiable; if your hardware does not support CUDA, look elsewhere.

Verdict

pdf2md delivers where it matters: a minimal Go binary that orchestrates vision models locally, avoiding vendor lock-in and API costs. At 42 stars with a credibility score of 89%, the project is clearly early-stage -- documentation is sparse and community backing is thin. Early adopters should expect rough edges and potential breaking changes. That said, the architecture is sound, the code is tested, and the approach is pragmatic. If you have GPU hardware and prefer self-hosted solutions, this is worth a serious look.
