MADQA is a benchmark of 2,250 human-written questions over 800 PDF documents, built to test how well AI agents reason over visual and textual document collections. It includes evaluation tools and baseline implementations.
How It Works
Use this tool to test how well AI assistants answer questions about real PDF documents such as reports and manuals.
Download ready-made questions and matching PDF files so you can start testing right away.
Choose between simple search helpers and visual readers that scan document pages for you.
Connect a language model service such as ChatGPT or Claude so your assistant can read and reason over pages.
Type a question and watch your assistant search pages, think step-by-step, and pull out answers with sources.
Automatically score how accurate the answers are and see exactly which pages were used.
View your scores next to top methods on the public leaderboard and improve your document reader.
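To make the scoring step concrete, here is a minimal Python sketch of how answer accuracy and page attribution could be measured. It is not MADQA's actual evaluation code: the `Prediction` class, the `(doc_id, page_number)` page format, the normalized exact-match rule, and the page-level F1 are all assumptions chosen for illustration, and the benchmark's own tools may use different metrics.

```python
import string
from dataclasses import dataclass, field


@dataclass
class Prediction:
    """Hypothetical record of one agent response (not a MADQA type)."""
    answer: str
    # Pages the agent cited, as (doc_id, page_number) pairs (assumed format).
    pages: set = field(default_factory=set)


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def answer_correct(predicted: str, gold: str) -> bool:
    """Assumed rule: exact match after normalization."""
    return normalize(predicted) == normalize(gold)


def page_f1(predicted_pages: set, gold_pages: set) -> float:
    """F1 between the pages the agent cited and the annotated gold pages."""
    overlap = len(predicted_pages & gold_pages)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted_pages)
    recall = overlap / len(gold_pages)
    return 2 * precision * recall / (precision + recall)


def score(predictions: list, gold: list) -> dict:
    """Average answer accuracy and page F1 over paired predictions and gold."""
    if not predictions:
        return {"accuracy": 0.0, "page_f1": 0.0}
    hits = [answer_correct(p.answer, g.answer) for p, g in zip(predictions, gold)]
    f1s = [page_f1(p.pages, g.pages) for p, g in zip(predictions, gold)]
    return {
        "accuracy": sum(hits) / len(hits),
        "page_f1": sum(f1s) / len(f1s),
    }
```

Keeping answer correctness separate from page overlap lets you tell apart an assistant that guessed the right answer from the wrong pages and one that found the right pages but misread them.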