xiaoxuanNLP / GoLongRL
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
GoLongRL is an open-source research project that helps train AI models to understand very long documents—up to one million words in a single read. The project provides a complete training dataset with 23,000 examples across nine different types of tasks (like finding specific information, summarizing, and reasoning about complex content). It also includes evaluation tools that test how well any AI model handles long documents, measuring capabilities like retrieval accuracy, mathematical reasoning, and comprehension across massive texts. The trained models (GoLongRL-4B and GoLongRL-30B-A3B) are publicly available and achieve performance comparable to much larger commercial models.
How It Works
You hear about GoLongRL, a new AI that can read and understand extremely long documents—like entire books or years of emails—in one go.
Researchers share their complete recipe: 23,000 examples covering 9 different skills like finding needles in haystacks, summarizing, and reasoning about long texts.
The system teaches the AI by rewarding it for correct answers across different types of long-document tasks, helping it get better at all of them together.
You point the evaluation tools at your own AI model and let them test how well it handles massive documents up to 1 million words.
Can it find the right information buried in long documents?
Can it make sense of numbers and facts across long texts?
Can it distill key points from lengthy content?
You now have clear insights into your AI's long-document capabilities, with scores comparing it against other leading models.
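As a rough illustration of the "find the right information buried in long documents" check described above, here is a minimal needle-in-a-haystack sketch. It is not GoLongRL's evaluation code: the function names (`build_haystack`, `retrieval_accuracy`, `truncating_model`), the filler sentence, and the `model(document, question)` call signature are all hypothetical and chosen only for this example.

```python
def build_haystack(needle: str, filler: str, n_sentences: int, position: float) -> str:
    """Place `needle` at a relative position (0.0 = start, 1.0 = end) among filler sentences."""
    sentences = [filler] * n_sentences
    sentences.insert(int(position * n_sentences), needle)
    return " ".join(sentences)


def retrieval_accuracy(model, secret_values, n_sentences=1000):
    """Fraction of hidden values that appear in the model's answers.

    `model` is any callable taking (document, question) and returning a string.
    This interface is an assumption made for the sketch.
    """
    hits = 0
    for i, value in enumerate(secret_values):
        needle = f"The secret code is {value}."
        # Spread the needles evenly from the start to the end of the document.
        position = i / max(len(secret_values) - 1, 1)
        document = build_haystack(needle, "The sky was grey that day.", n_sentences, position)
        answer = model(document, "What is the secret code?")
        hits += value in answer
    return hits / len(secret_values)


def truncating_model(document: str, question: str) -> str:
    """Stand-in 'model' that only reads the first 2,000 characters of the document."""
    window = document[:2000]
    marker = "The secret code is "
    start = window.find(marker)
    if start == -1:
        return "unknown"
    return window[start + len(marker):].split(".")[0]


if __name__ == "__main__":
    score = retrieval_accuracy(truncating_model, ["1234", "5678", "9012"])
    print(f"retrieval accuracy: {score:.2f}")
```

The stand-in model deliberately ignores everything past a short window, so it only finds needles placed near the start. Swapping in a real model behind the same callable would show how retrieval holds up as the needle moves deeper into the document.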