ZEDA is a research framework from Tsinghua University that transforms already-trained Mixture-of-Experts AI models into faster, more efficient versions. It works by injecting special 'zero experts' (placeholder components that require no computation) and then training the model to use fewer active experts during inference. The process reduces computational costs by over 50% while maintaining most of the original model's capabilities. The project includes complete training scripts, evaluation tools across math/code/instruction benchmarks, and releases adapted versions of popular models like Qwen3 and GLM.
How It Works
A researcher learns about ZEDA - a technique that can cut more than half of the inference computation in their existing AI models without losing much accuracy.
They read about 'zero experts' - special placeholder components that let the model skip half its work during inference.
They download their trained AI model and prepare 60,000 example prompts for the adaptation process.
They inject zero experts into their model, expanding it into a dynamic version that can choose which parts to activate.
The model learns through two stages: first studying example responses, then practicing on its own outputs.
They run the adapted model through math problems, coding challenges, and instruction-following tests to measure quality.
The model now runs about 1.2× faster with over half the computation eliminated, at only a small accuracy cost.
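The zero-expert idea in the steps above can be illustrated with a toy router. This is a minimal sketch in plain Python, not the project's code: the names `route_token`, `moe_forward`, and `EXPERTS` are hypothetical, the experts are stand-in scaling functions, and the choice to weight outputs by a softmax over all slots (real and zero) without renormalizing is an assumption made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, num_real, top_k):
    """Pick the top_k slots among real and zero experts.

    Slots with index >= num_real are zero experts. They are dropped
    here because they produce no output and need no computation.
    Returns a list of (real_expert_index, weight) pairs.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    return [(i, probs[i]) for i in chosen if i < num_real]

def moe_forward(x, experts, router_logits, top_k):
    """Run only the selected real experts and sum their weighted outputs."""
    active = route_token(router_logits, len(experts), top_k)
    out = [0.0] * len(x)
    for idx, weight in active:
        y = experts[idx](x)  # the only place compute is spent
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, len(active)

# Four toy "real" experts (hypothetical): each scales its input.
EXPERTS = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

# Eight router logits: slots 0-3 are real experts, slots 4-7 are zero experts.
logits = [2.0, 0.1, 1.5, -1.0, 1.8, 1.2, -0.5, 0.0]
output, n_active = moe_forward([1.0, 2.0], EXPERTS, logits, top_k=4)
print(n_active)  # 2: two of the four selected slots were zero experts
```

With these logits the router selects slots 0, 4, 2, and 5, so only real experts 0 and 2 run. The more often the router prefers a zero expert, the fewer real experts execute per token, which is where the compute savings described above would come from.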