redai-infra / PIPO
Implementation of an efficient LLM architecture: the Pair-In / Pair-Out Model (PIPO)
PIPO (Pair-In, Pair-Out) is an academic research project that speeds up large language model inference by compressing pairs of tokens. Built by researchers from Chinese universities and Xiaohongshu, it trains models to reason over compressed token pairs without sacrificing accuracy. The project includes training scripts and evaluation tools for math and coding benchmarks, and it works with Qwen3.5 models. Users can download pre-trained checkpoints or train their own models with the provided scripts.
How It Works
You find an academic paper about PIPO, a new technique that makes AI assistants answer questions faster by thinking in compressed pairs.
You download the code and install the tools needed to run the experiments, following simple setup instructions.
You download a Qwen3.5 model from Hugging Face, either the smaller 4B version or the larger 9B version.
You run the training script to teach the model to think in compressed token pairs, which is the core innovation of PIPO.
You test the trained model on math problems, coding challenges, and other benchmarks to see how well it performs.
Your model now answers questions up to 2.6× faster while maintaining the same quality.
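The pairing idea in the steps above can be illustrated with a toy sketch. It is not the repository's implementation: the helper names `pair_tokens` and `decode_steps` are hypothetical, and the real model learns its compression during training rather than grouping token IDs mechanically. The sketch only shows why handling two tokens per step reduces the number of sequential steps; it does not reproduce or explain the reported 2.6× figure.

```python
from typing import List, Optional, Tuple


def pair_tokens(
    token_ids: List[int], pad_id: Optional[int] = None
) -> List[Tuple[int, Optional[int]]]:
    """Group a flat token sequence into consecutive pairs (hypothetical helper).

    An odd-length sequence gets `pad_id` as the second element of its last pair.
    """
    pairs = []
    for i in range(0, len(token_ids), 2):
        first = token_ids[i]
        second = token_ids[i + 1] if i + 1 < len(token_ids) else pad_id
        pairs.append((first, second))
    return pairs


def decode_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Number of sequential steps needed to emit `num_tokens` (ceiling division)."""
    return -(-num_tokens // tokens_per_step)


if __name__ == "__main__":
    tokens = [101, 7, 42, 9, 13]  # made-up token IDs, for illustration only
    print(pair_tokens(tokens))
    print("one token per step:", decode_steps(len(tokens), 1))
    print("one pair per step:", decode_steps(len(tokens), 2))
```

Under this simplification, a five-token sequence takes five steps one token at a time and three steps in pairs.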