shipfastlabs / parsel

Public

A fast, helpful, and open-source document parser for PHP

47
2
89% credibility
Found May 30, 2026 at 60 stars.
AI Analysis
PHP
AI Summary

Parsel is a PHP library that reads and extracts content from PDFs, images, and Office files. All processing happens locally, so your files never leave your machine. You can get plain text, structured data with text positions and fonts, or page screenshots. It supports OCR for scanned documents, lets you select specific pages, and works with files on disk or with raw bytes such as an uploaded file.

How It Works

1
📄 You have a document to process

You have a PDF, image, or Office file that needs to be read and understood by your application.

2
🔧 You install the parsing tool

You install Parsel into your PHP project using Composer, following the simple setup instructions.

3
📦 You load your document

You point Parsel to your file on disk or pass the raw bytes directly, like an uploaded file.

4
You choose how to extract the content

You decide whether you want plain text, structured data with positions, or page screenshots.

5
Different extraction needs
📝
Plain text extraction

You call the text method to get clean, readable text without page markers.

🗂
Structured data extraction

You call the parse method to get detailed information including text positions, fonts, and page layouts.

🖼
Screenshot generation

You call the screenshots method to render page images into a folder.

6
🔍 You apply advanced options if needed

You can enable OCR for scanned documents, set the rendering resolution (DPI), or limit parsing to specific pages.

🎉 You get your results

Parsel returns exactly what you asked for — text, structured data, or images — all processed locally on your machine.
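
The steps above can be sketched with a small fluent builder. This is a minimal illustration of the chaining pattern, not Parsel's actual API: the class `DocumentRequest`, the methods `fromPath`, `pages`, `ocr`, `dpi`, and `toArguments`, and the flag names are all hypothetical, chosen only for this example. It assumes PHP 8.0 or later.

```php
<?php

// Hypothetical stand-in that only illustrates fluent option chaining.
// None of these names or flags are taken from Parsel or the `lit` binary.
final class DocumentRequest
{
    /** @var int[] */
    private array $pages = [];
    private bool $ocr = false;
    private int $dpi = 150;

    private function __construct(private string $path) {}

    public static function fromPath(string $path): self
    {
        return new self($path);
    }

    // Each option method returns a modified copy, so a base request can be reused.
    public function pages(int ...$pages): self
    {
        $clone = clone $this;
        $clone->pages = $pages;
        return $clone;
    }

    public function ocr(bool $enabled = true): self
    {
        $clone = clone $this;
        $clone->ocr = $enabled;
        return $clone;
    }

    public function dpi(int $dpi): self
    {
        if ($dpi <= 0) {
            throw new InvalidArgumentException('DPI must be positive.');
        }
        $clone = clone $this;
        $clone->dpi = $dpi;
        return $clone;
    }

    /** @return string[] Argument list a wrapper might hand to an external binary. */
    public function toArguments(): array
    {
        $args = [$this->path, '--dpi=' . $this->dpi];
        if ($this->pages !== []) {
            $args[] = '--pages=' . implode(',', $this->pages);
        }
        if ($this->ocr) {
            $args[] = '--ocr';
        }
        return $args;
    }
}

// Chain page selection, OCR, and DPI in one expression.
$request = DocumentRequest::fromPath('invoice.pdf')->pages(1, 3)->ocr()->dpi(300);
print_r($request->toArguments());
```

Returning a copy from each option method keeps requests immutable, which is one common way to make a chained configuration safe to share between calls. Whether Parsel does the same is not stated in the source.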

AI-Generated Review

What is parsel?

Parsel is a PHP library that extracts text and data from PDFs, Office documents, and images. It wraps the `lit` binary to give you a clean, fluent API for parsing documents locally without sending them to external services. You get plain text, structured document data with coordinates, font information, and even page screenshots. The library handles PDFs, Word docs, spreadsheets, presentations, and scanned images with optional OCR.

Why is it gaining traction?

The API is refreshingly simple: one line gets you text, and a few more get you structured data with positions. Developers tired of wrestling with PDF libraries appreciate that Parsel handles the complexity internally. Local processing means no API keys, no rate limits, and no data leaving your servers. The fluent interface lets you chain page selection, OCR settings, and DPI options naturally. Built-in streaming keeps memory use in check on large documents, and the fake runner lets you test parsing code without invoking the real binary.
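
To show why a fake runner helps with testing, here is a minimal sketch of the general pattern: the code that needs document text depends on a small runner interface, and tests swap in a fake that returns canned output instead of launching a binary. The names `Runner`, `FakeRunner`, and `TextExtractor` are hypothetical and are not taken from Parsel's codebase. It assumes PHP 8.0 or later.

```php
<?php

// Hypothetical names throughout; this is not Parsel's real runner API.
interface Runner
{
    /**
     * @param string[] $arguments Arguments for the external parser.
     * @return string Raw output produced by the parser.
     */
    public function run(array $arguments): string;
}

// Test double: records every call and returns a fixed string.
final class FakeRunner implements Runner
{
    /** @var string[][] */
    public array $calls = [];

    public function __construct(private string $cannedOutput) {}

    public function run(array $arguments): string
    {
        $this->calls[] = $arguments;
        return $this->cannedOutput;
    }
}

// Code under test: depends only on the interface, never on a real process.
final class TextExtractor
{
    public function __construct(private Runner $runner) {}

    public function text(string $path): string
    {
        return trim($this->runner->run([$path]));
    }
}

$fake = new FakeRunner("Invoice #42\n");
$extractor = new TextExtractor($fake);
$result = $extractor->text('invoice.pdf');
echo $result, PHP_EOL;
```

Because the fake records its calls, a test can assert both on the returned text and on the arguments that would have been passed to the binary, with no document or binary present on the test machine.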

Who should use this?

PHP developers building document processing workflows: invoice extraction pipelines, archival systems, accessibility tools, or any application that needs to pull structured data from uploaded files. Teams with privacy requirements appreciate the local-only processing. If you need coordinates for building annotations or need OCR for scanned documents, this handles both without requiring separate libraries.

Verdict

Parsel delivers a well-designed API for a real pain point. The codebase shows solid engineering with proper abstractions and testability. However, the project is young, with only 47 stars and limited community signals (89% credibility score). Before betting on it for production, verify that the `lit` binary ecosystem meets your needs and that the library continues to be maintained. Worth evaluating now, but not yet a "safe bet" for critical systems without your own testing.
