ali-vilab / ProMoE (Public)

[ICLR2026] The official code of "Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance"

30 stars · 1 fork · 100% credibility · Found Feb 06, 2026 at 25 stars
AI Summary (Python)

ProMoE is a research toolkit for training and evaluating diffusion transformers that generate images by routing work across teams of specialized experts (a Mixture-of-Experts architecture).

How It Works

1
🔍 Discover ProMoE

You stumble upon this research project, which makes diffusion-based image generators more capable by routing each token to specialized experts inside the transformer.

2
🛠️ Set up your workspace

You create a fresh Python environment and install the project's dependencies.

3
🖼️ Gather your picture collection

You download a large labeled image dataset, such as ImageNet, so the system can learn what real images look like.

4
⚡ Speed up prep with smart compression

You optionally pre-encode the images into compact VAE latents so training doesn't redo that work every epoch.
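The pre-encoding step above can be sketched roughly like this. Note that the encoder, cache directory, and file layout here are hypothetical stand-ins for illustration, not the repo's actual API:

```python
import pickle
import tempfile
from pathlib import Path

def encode_to_latent(image):
    """Hypothetical stand-in for a VAE encoder: shrinks an image
    (a flat list of pixel values) into a smaller latent list by
    averaging blocks of 4 values."""
    return [sum(image[i:i + 4]) / 4 for i in range(0, len(image), 4)]

def cache_latents(images, cache_dir):
    """Encode each image once and pickle the latent to disk, so
    training epochs can load latents instead of re-encoding."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for idx, image in enumerate(images):
        path = cache_dir / f"latent_{idx:06d}.pkl"
        with open(path, "wb") as f:
            pickle.dump(encode_to_latent(image), f)
        paths.append(path)
    return paths

# Usage: cache two toy "images" and load one latent back.
with tempfile.TemporaryDirectory() as tmp:
    paths = cache_latents([[1.0] * 8, [2.0] * 8], tmp)
    with open(paths[0], "rb") as f:
        latent = pickle.load(f)
    print(latent)  # the 8-pixel image compressed to 2 latent values
```

The one-time cost of encoding is paid up front, so every later epoch reads small latent files instead of running the encoder again.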

5
🚀 Launch the training adventure

You pick a config file for a ProMoE variant and launch training, watching the model learn to turn pure noise into convincing images.
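Under Rectified Flow, each training step interpolates between noise and a clean latent and regresses the straight-line velocity between them. A minimal scalar sketch of that objective, not the repo's code:

```python
def rectified_flow_loss(x0, x1, t, predict_velocity):
    """One Rectified Flow training example on flat vectors:
    x0 is noise, x1 is the clean latent, t in [0, 1].
    The network's regression target is the constant velocity x1 - x0."""
    x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]      # interpolate
    target = [b - a for a, b in zip(x0, x1)]                  # true velocity
    pred = predict_velocity(x_t, t)                           # model output
    return sum((p - g) ** 2 for p, g in zip(pred, target)) / len(target)

# An "oracle" model that already knows the true endpoints gets zero loss.
x0, x1 = [0.0, 0.0], [2.0, 2.0]
oracle = lambda x_t, t: [b - a for a, b in zip(x0, x1)]
print(rectified_flow_loss(x0, x1, 0.5, oracle))  # 0.0
```

A real trainer would replace the oracle with the ProMoE network and average this MSE over a batch of cached latents.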

6
🎨 Create your first batch of images

You sample thousands of new class-conditional images across styles and save them as PNG files.
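Sampling then amounts to integrating the learned velocity field from pure noise to an image. A toy Euler loop under the same straight-line assumption (class labels, classifier-free guidance, and the real network are omitted here):

```python
def euler_sample(x, velocity_fn, steps=50):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (sample)
    with fixed-step Euler, as a stand-in for the repo's sampler."""
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + vi * dt for xi, vi in zip(x, v)]
    return x

# With a constant velocity field v = x1 - x0, Euler lands on x1 exactly
# (up to float rounding), which is the appeal of rectified flows.
x0, x1 = [0.0, 1.0], [3.0, -1.0]
sample = euler_sample(x0, lambda x, t: [b - a for a, b in zip(x0, x1)])
print(sample)  # close to [3.0, -1.0]
```

In practice the velocity field is curved, so more steps (or a higher-order solver) trade speed for fidelity.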

7
📊 Measure the magic

You run the evaluation scripts to compute quality scores such as FID and Inception Score, measuring how realistic and diverse your samples are.
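FID compares the feature statistics of real and generated images. For 1-D Gaussian statistics the Fréchet distance reduces to a closed form, sketched below; real FID uses Inception-network features and full covariance matrices, so this is only the scalar analogue:

```python
import math

def frechet_distance_1d(mu1, var1, mu2, var2):
    """Fréchet distance between two 1-D Gaussians: the scalar
    analogue of FID's |mu1 - mu2|^2 + Tr(S1 + S2 - 2(S1 S2)^(1/2))."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2 * math.sqrt(var1 * var2)

def stats(xs):
    """Sample mean and (population) variance of a list of features."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

real = [0.1, -0.2, 0.05, 0.0]
fake = [0.1, -0.2, 0.05, 0.0]  # identical "features"
print(frechet_distance_1d(*stats(real), *stats(fake)))  # ~0 when stats match
```

Lower is better: a score near zero means the generated feature distribution is statistically indistinguishable from the real one.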

🎉 Celebrate top-notch results

Your model now generates images competitive with the strongest baselines, ready for papers, experiments, or just wow-factor fun!


AI-Generated Review

What is ProMoE?

ProMoE is the official Python code for the ICLR 2026 paper on scaling Diffusion Transformers with Mixture-of-Experts (MoE) via explicit routing guidance. It trains class-conditional 256x256 image generators on ImageNet, handling VAE latent preprocessing, Rectified Flow training, multi-GPU sampling, and metrics like FID/IS/sFID/Precision/Recall. Developers get ready-to-run configs for ProMoE variants (S/B/L/XL) plus baselines like DiT and DiffMoE, and the work addresses why standard MoE routing underperforms on vision tokens, where spatial redundancy confounds naive token routing.

Why is it gaining traction?

Unlike dense DiTs or basic MoE ports from LLMs, ProMoE's two-step router partitions tokens by role (conditional/unconditional) and by semantics, beating state-of-the-art results on ImageNet under both Rectified Flow and DDPM. Having the baselines in one repo lets you compare routing schemes head-to-head, while pre-computed latent caching and eval scripts slash setup time. The ICLR 2026 paper argues that "routing matters" for vision MoE, appealing to diffusion researchers who want scalable generation without a matching compute explosion.
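The two-step idea described above, first splitting tokens by their conditional/unconditional role and then routing within each group by content, can be caricatured as follows. The role partition, scoring rule, and expert pools here are hypothetical illustrations, not the paper's actual architecture:

```python
def two_step_route(tokens, num_experts):
    """Toy two-step routing. Step 1 partitions tokens by their
    conditional/unconditional role; step 2 picks an expert per token
    by a content score (here: argmax of a dummy dot product)."""
    assignments = {}
    for tok_id, (role, features) in tokens.items():
        # Step 1: the role decides which half of the experts is eligible.
        half = num_experts // 2
        pool = range(0, half) if role == "cond" else range(half, num_experts)
        # Step 2: score each eligible expert against the token's features.
        scores = {e: sum((e + 1) * f for f in features) for e in pool}
        assignments[tok_id] = max(scores, key=scores.get)
    return assignments

tokens = {
    0: ("cond", [0.5, 0.1]),
    1: ("uncond", [0.2, 0.3]),
}
print(two_step_route(tokens, 4))  # cond token lands in {0,1}, uncond in {2,3}
```

The point of the explicit first step is that experts never have to learn the role split themselves, so capacity inside each pool is spent on genuine semantic differences.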

Who should use this?

Diffusion model researchers tuning MoE for vision tasks like class-conditional ImageNet. ML engineers at labs prototyping scalable DiTs beyond dense limits. Teams forking for custom datasets needing explicit guidance in Python diffusion code.

Verdict

Grab it if you're evaluating ProMoE for experiments around its ICLR 2026 submission: docs cover setup and evaluation cleanly, but the repo's youth and low star count signal early maturity, so expect tweaks for production. Solid baseline playground, not battle-tested yet.


