thunlp / hybrid-linear-attention
Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts
This repository provides code and models for training hybrid linear attention architectures that scale to extremely long contexts via efficient distillation.
How It Works
1. Hybrid linear attention interleaves linear-attention layers with standard softmax-attention layers, so a model can handle very long documents and conversations without slowing down (see the sketch after this list).
2. Download the provided configs and sample setups to get an environment running in minutes.
3. Follow the three-step distillation pipeline to blend the attention layers into an efficient hybrid model.
4. Run the distilled model on long documents to verify it retains information across the full context.
5. The result processes extremely long inputs quickly while preserving model quality.
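For intuition, here is a minimal PyTorch sketch of the general idea: most layers use O(n) linear attention, and a few layers keep standard softmax attention. This is not the paper's implementation; the elu+1 feature map, the 1-in-4 layer ratio, and all dimensions are illustrative assumptions, and the linear attention shown is non-causal for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearAttention(nn.Module):
    """O(n) attention via a positive kernel feature map (elu(x) + 1)."""
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = F.elu(q) + 1, F.elu(k) + 1            # keep scores positive
        kv = torch.einsum("bnd,bne->bde", k, v)      # sum over keys: phi(k) v^T
        z = k.sum(dim=1)                             # normalizer: sum phi(k)
        num = torch.einsum("bnd,bde->bne", q, kv)
        den = torch.einsum("bnd,bd->bn", q, z).unsqueeze(-1)
        return self.out(num / (den + 1e-6))


class SoftmaxAttention(nn.Module):
    """Standard O(n^2) scaled dot-product attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        return self.out(F.scaled_dot_product_attention(q, k, v))


class HybridBlock(nn.Module):
    """Pre-norm residual block wrapping either attention type."""
    def __init__(self, dim: int, use_softmax: bool):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = SoftmaxAttention(dim) if use_softmax else LinearAttention(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.attn(self.norm(x))


# Example: 12 layers, full softmax attention kept only in every 4th layer
# (the ratio is a placeholder, not the paper's recipe).
dim, depth = 256, 12
model = nn.Sequential(*[HybridBlock(dim, use_softmax=(i % 4 == 3)) for i in range(depth)])
x = torch.randn(2, 1024, dim)    # (batch, sequence length, dim)
print(model(x).shape)            # torch.Size([2, 1024, 256])
```

The design choice the sketch illustrates is the trade-off: linear-attention layers give near-linear cost in sequence length, while the sparse softmax layers retain exact token-to-token retrieval; the distillation pipeline transfers a pretrained model into this mixed stack.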