biao994 / DocPaws

Public

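Production-oriented RAG document assistant: knowledge bases, PDF indexing, agent tool orchestration, scoped retrieval, citation tracing, and refusal thresholds. FastAPI + Vue3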

55
1
85% credibility
Found May 31, 2026 at 55 stars.
AI Analysis
Python
AI Summary

DocPaws is an enterprise-grade document assistant that lets you upload PDF files, organize them in folders, and then ask questions in plain language. The app reads your documents, builds a searchable knowledge base, and answers your questions by finding the relevant passages. You can choose between a quick answer mode and a deep analysis mode that shows the AI's reasoning. Every answer includes citations pointing to the exact pages it came from, so you always know the source.

How It Works

1
📄 You upload your PDF documents

You drag and drop your PDF files into the app, and they get organized into folders automatically.

2
🔍 The app reads and understands your documents

Behind the scenes, the app reads through every page of your PDFs and creates a searchable knowledge base.

3
💬 You ask questions in plain language

You type a question like 'What was the revenue last quarter?' and the app finds the answer directly from your documents.

4
Choose how detailed you want your answer
Quick mode

Get a fast, direct answer from your documents.

🧠
Deep mode

See the AI's thinking process and detailed reasoning before the final answer.

5
📎 You see where every answer comes from

Each answer shows the exact page and document it came from, so you can always verify and trust the results.

You get accurate answers instantly

Instead of searching through hundreds of pages yourself, the app instantly pulls the right information and shows you exactly where it found it.

AI-Generated Review

What is DocPaws?

DocPaws is an enterprise-grade RAG document assistant that lets you upload PDFs, build searchable knowledge bases, and chat with your documents using AI. The Python backend runs FastAPI with SQLModel for the database layer, while a Vue 3 frontend handles the UI. It handles the full pipeline: PDF parsing, chunking, vector indexing with FAISS, and retrieval-augmented generation with citation tracking. The system supports configurable L2 distance thresholds to automatically reject answers when retrieval quality is too low.
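The distance-threshold rejection described above can be illustrated with a minimal sketch. This is not DocPaws code: the `Hit` record, the `answer_or_refuse` function, and the `max_distance` default are all hypothetical names and values chosen for illustration, and plain Python stands in for the FAISS search step.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieved chunk. Field names are hypothetical, not taken from DocPaws."""
    doc: str
    page: int
    text: str
    distance: float  # L2 distance between the query and chunk embeddings


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def answer_or_refuse(hits, max_distance=1.0):
    """Keep only hits within the threshold; refuse when none qualify.

    max_distance is an assumed, configurable cutoff. A real deployment
    would tune it against an evaluation set.
    """
    usable = sorted(
        (h for h in hits if h.distance <= max_distance),
        key=lambda h: h.distance,
    )
    if not usable:
        return {"refused": True, "citations": []}
    # Citations carry the document and page so the answer can be traced back.
    return {"refused": False, "citations": [(h.doc, h.page) for h in usable]}
```

One detail worth checking in any real setup: FAISS's flat L2 index reports squared distances, so the threshold has to follow whichever convention the retrieval layer actually returns.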

Why is it gaining traction?

The project addresses several common pain points in RAG implementations. Its manifest-based incremental indexing means that uploading a new PDF does not trigger a rebuild of the entire vector store; only the changed pieces are reindexed. Scope control lets users query across a whole knowledge base, a specific folder, or a single document, which is useful for large document collections. The retrieval distance threshold is a notable feature: the system refuses to answer if the retrieved chunks are too semantically distant from the question, which lowers the risk of hallucinated answers. The built-in Golden 20 evaluation framework lets teams benchmark their RAG pipelines without external tooling.
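As a rough illustration of manifest-based incremental indexing, the sketch below compares content hashes from a previous run with the current files to decide what needs reindexing. The manifest layout (file name mapped to a SHA-256 digest) and the function names are assumptions for this example; the project's actual manifest format is not shown in this review.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Content hash used to detect changed files."""
    return hashlib.sha256(data).hexdigest()


def plan_reindex(manifest: dict, current_files: dict):
    """Compare the stored manifest with the files currently present.

    manifest:      file name -> digest recorded at the last indexing run
    current_files: file name -> raw bytes of the file as it is now
    Returns (to_index, to_remove, new_manifest).
    """
    new_manifest = {name: file_digest(data) for name, data in current_files.items()}
    # New or modified files: digest is missing from or differs in the old manifest.
    to_index = sorted(n for n, d in new_manifest.items() if manifest.get(n) != d)
    # Files that were indexed before but no longer exist.
    to_remove = sorted(n for n in manifest if n not in new_manifest)
    return to_index, to_remove, new_manifest
```

Only the names in `to_index` would need to be parsed, chunked, and embedded again; unchanged files keep their existing vectors.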

Who should use this?

Backend developers building internal document Q&A tools will find the FastAPI structure familiar and extensible. Teams managing large PDF collections who need to restrict queries to a folder or a single document will benefit from the scope filtering. Organizations already using DeepSeek or OpenAI-compatible APIs will appreciate the minimal configuration required to get started. The evaluation module is valuable for ML teams that need reproducible RAG benchmarking.
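Scope filtering at the knowledge-base, folder, or document level can be sketched as a simple predicate over chunk metadata. The `Chunk` and `Scope` types and their field names are hypothetical; they only illustrate the idea of narrowing candidates before or after vector search.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    """Indexed text chunk with the metadata needed for scoping (assumed fields)."""
    kb_id: str
    folder: str
    doc_id: str
    page: int
    text: str


@dataclass(frozen=True)
class Scope:
    """Query scope: a whole knowledge base, one folder, or one document."""
    kb_id: str
    folder: Optional[str] = None
    doc_id: Optional[str] = None


def in_scope(chunk: Chunk, scope: Scope) -> bool:
    if chunk.kb_id != scope.kb_id:
        return False
    if scope.doc_id is not None:   # narrowest scope wins
        return chunk.doc_id == scope.doc_id
    if scope.folder is not None:
        return chunk.folder == scope.folder
    return True                    # whole knowledge base


def filter_chunks(chunks, scope: Scope):
    """Return only the chunks that fall inside the requested scope."""
    return [c for c in chunks if in_scope(c, scope)]
```

Note that this narrows what a query searches; it is not an access-control mechanism on its own.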

Verdict

DocPaws is a well-structured, feature-complete RAG starter kit with a credibility score of 85%. At 55 stars, it is early-stage but shows thoughtful engineering across the indexing pipeline and agent tooling. The layered architecture (API, domain, infra, usecases) is clean enough to extend. Test coverage exists (79+ pytest cases), but production hardening requires replacing the default SECRET_KEY and reviewing FAISS index permissions. Worth evaluating as a foundation for document Q&A systems, though teams should budget time for operational setup of Redis and Celery workers.
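One way to act on the SECRET_KEY advice is a startup check that fails fast when the key is missing or left at a placeholder. This is a sketch under stated assumptions: the placeholder values and the minimum length are invented for illustration, and the project's real default and configuration loading are not shown in this review.

```python
import os
import secrets

# Hypothetical placeholder values; the project's actual default is not known here.
INSECURE_DEFAULTS = {"", "changeme", "secret"}


def generate_secret_key() -> str:
    """Produce a random URL-safe key suitable for an environment variable."""
    return secrets.token_urlsafe(48)


def load_secret_key(env=None, min_length=32) -> str:
    """Read SECRET_KEY and reject missing, placeholder, or short values."""
    env = os.environ if env is None else env
    key = env.get("SECRET_KEY", "")
    if key in INSECURE_DEFAULTS or len(key) < min_length:
        raise RuntimeError("SECRET_KEY must be set to a long random value")
    return key
```

Calling `load_secret_key()` during application startup would surface a misconfigured deployment before it serves any requests.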
