deepdiy / pdf2md

Public

A blazing fast, layout-aware PDF to Markdown converter built with Rust. Uses DocLayoutNet YOLO-based detection to preserve document structure — images, tables, formulas, captions, headers and more. Pre-built binaries available for macOS, Linux and Windows. Also offers a free online tool and API at pdf2md.deepdiy.net.

13
1
89% credibility
Found May 18, 2026 at 18 stars.
AI Analysis
Rust
AI Summary

pdf2md is a fast, free tool that converts PDF documents into clean Markdown text while preserving the original layout. It uses artificial intelligence to recognize document structure—headings, tables, images, formulas, and captions—so nothing gets lost in translation. You can use it instantly through a free web service, download a ready-made app for your computer, or host it yourself. The tool runs efficiently even on small servers and processes documents about ten times faster than similar converters.

How It Works

1. 💡 You need to turn a PDF into editable text

You've got a research paper, report, or document that you want to read as plain text you can edit and share.

2. 🔍 The tool understands document structure

Unlike basic converters, this one uses AI to recognize headings, tables, images, and formulas, so nothing gets lost or jumbled.

3. Choose how you want to convert

🌐 Try it online right now: Upload your PDF to the free web tool. No account is needed, just drag and drop.

💻 Download the app: Get a ready-made program for your Mac, Windows, or Linux computer. Works offline.

🏠 Run it yourself: Host the web interface on your own machine for private conversions.

4. 📄 Drop in your PDF

Select your PDF file and watch as the tool reads through every page, detecting the layout automatically.

5. Conversion happens in seconds

The tool processes your document much faster than other converters, keeping paragraphs intact and preserving images.

You get clean, usable Markdown

Your converted document is ready. Download it as a text file or a ZIP with all the images included.

AI-Generated Review

What is pdf2md?

A Rust-based tool that converts PDFs to Markdown while preserving document structure. It uses a neural network to detect layout elements like tables, images, formulas, captions, and headings, then outputs clean, readable Markdown. You get a standalone binary that runs on any VPS or laptop, no Docker required. There's also a free web API and optional browser-based UI via Streamlit.
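
To make the "detect layout elements, then output Markdown" idea concrete, here is a minimal sketch of the second half of that pipeline. It is not pdf2md's actual code: the `LayoutClass` enum, the `Region` struct, and `render_markdown` are hypothetical names invented for illustration, and the sketch assumes the regions have already been detected and sorted into reading order.

```rust
/// Layout classes mentioned in the review. The enum itself is hypothetical
/// and does not reflect pdf2md's internal types.
#[derive(Debug, Clone, PartialEq)]
enum LayoutClass {
    Heading,
    Text,
    Table,
    Image,
    Formula,
    Caption,
}

/// A detected region: its class plus the text (or image path) extracted from it.
#[derive(Debug, Clone)]
struct Region {
    class: LayoutClass,
    content: String,
}

/// Render regions, assumed to be in reading order, as Markdown blocks
/// separated by blank lines.
fn render_markdown(regions: &[Region]) -> String {
    let blocks: Vec<String> = regions
        .iter()
        .map(|r| match r.class {
            LayoutClass::Heading => format!("## {}", r.content),
            LayoutClass::Text => r.content.clone(),
            // Table content is assumed to already be a Markdown table.
            LayoutClass::Table => r.content.clone(),
            // For images, `content` is assumed to hold a relative file path.
            LayoutClass::Image => format!("![]({})", r.content),
            LayoutClass::Formula => format!("$$\n{}\n$$", r.content),
            LayoutClass::Caption => format!("*{}*", r.content),
        })
        .collect();
    blocks.join("\n\n")
}
```

Keeping the class on each region is what lets a converter emit different Markdown for a heading, an image, and a caption instead of one undifferentiated stream of text.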

Why is it gaining traction?

The combination of Rust performance and layout-aware detection hits a sweet spot. Most PDF converters either destroy complex layouts or require heavy cloud services. pdf2md ships pre-built binaries for three platforms (macOS, Linux, and Windows), so you can self-host on a 1-core VPS. The API requires no signup or API key. The detection classes let you filter unwanted elements like page headers or footnotes from the output. For a tool doing one job, the feature set is surprisingly complete.
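
Filtering by detection class can be pictured as dropping every element whose class is on an exclusion list before the output is written. The sketch below illustrates that idea only: `ElementClass`, `Element`, and `drop_classes` are hypothetical names, not pdf2md's API, and how the real tool exposes this option is not shown in this review.

```rust
use std::collections::HashSet;

/// A small, hypothetical subset of detection classes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ElementClass {
    Text,
    Table,
    PageHeader,
    Footnote,
}

/// A detected element with its class and extracted text.
#[derive(Debug, Clone, PartialEq)]
struct Element {
    class: ElementClass,
    text: String,
}

/// Keep only the elements whose class is not in `excluded`,
/// preserving the original order.
fn drop_classes(elements: Vec<Element>, excluded: &HashSet<ElementClass>) -> Vec<Element> {
    elements
        .into_iter()
        .filter(|e| !excluded.contains(&e.class))
        .collect()
}
```

Calling `drop_classes` with a set containing `ElementClass::PageHeader` and `ElementClass::Footnote` would leave body text and tables untouched while removing running headers and footnotes.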

Who should use this?

Researchers converting academic papers with tables and equations. Documentation teams migrating content from PDF sources. Developers building pipelines that need machine-readable text from PDFs. Anyone frustrated by existing converters that break tables or scatter text across broken paragraphs. The free API works for light, one-off conversions without installing anything.

Verdict

This is a legitimate option for PDF-to-Markdown conversion with a thoughtful feature set. The credibility score sits at 89%, but with only 13 stars, the project is early-stage and under-documented. There is no test suite or CI visible in the repository. That said, the Rust implementation is lean, the pre-built binaries reduce friction, and the layout detection solves a real pain point. It is worth evaluating for specific use cases, but monitor the repository for stability before committing to a production pipeline.
