ant-research / Drift

Public

Drift: DLM Reinforcement Learning Training Framework

19
10
100% credibility
Found May 26, 2026 at 19 stars.
AI Analysis
Python
AI Summary

Drift is an open-source reinforcement learning framework designed to improve diffusion language models, a newer type of AI that generates text differently than traditional models. The framework helps researchers and developers train AI models to become better at practical tasks like solving math problems, writing code that passes tests, and solving puzzles like Sudoku. It works by having the AI generate multiple possible answers, evaluating which ones are correct, and using that feedback to gradually improve. Drift supports two popular diffusion models (LLaDA and Dream), offers flexible training options from single-computer experiments to large-scale distributed training, and includes built-in tools to evaluate how well the trained model performs on standard benchmarks.

How It Works

1
Discover a Better Way to Train AI

You find Drift while looking for tools to make AI models better at math, coding, and puzzles using reinforcement learning.

2
Prepare Your Training Data

You organize your problems and correct answers into simple JSON files: math problems, coding challenges, or puzzles you want the AI to master.

3
Choose Your Settings

You pick a pre-made configuration for math or code tasks, or customize how the AI learns by adjusting how many attempts it gets and how it improves.

4
Launch Training

With one command, your AI begins learning: it generates answers, checks them against correct solutions, and gradually gets better at solving your problems.

5
Choose Your Training Setup
Single Computer

Perfect for getting started and experimenting with smaller datasets.

Multiple Computers

Scale up to train on large datasets much faster using many machines together.

6
Watch Your AI Improve

The system automatically tracks how well your AI is doing, measuring accuracy on math problems and code tests as training progresses.

Get a Smarter AI Model

After training finishes, you have an AI that's better at solving math problems, writing code, and tackling puzzles, ready to use or share with others.
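
The steps above can be pictured with a small Python sketch: load problems from JSON, sample several attempts per problem, check each attempt against the reference answer, and report accuracy. This is an illustration only. The record fields (`question`, `answer`), the `toy_model` stand-in, and the helper names are assumptions made for this example and are not taken from Drift's code or data format.

```python
import json
import random

# Hypothetical record layout; Drift's real JSON schema may differ.
PROBLEMS_JSON = """
[
  {"question": "What is 2 + 3?", "answer": "5"},
  {"question": "What is 7 * 6?", "answer": "42"}
]
"""

# Answers the toy model "knows"; a real model would generate text instead.
KNOWN_ANSWERS = {"What is 2 + 3?": "5", "What is 7 * 6?": "42"}


def toy_model(question: str, rng: random.Random) -> str:
    """Stand-in for a diffusion LM: returns the right answer about half the time."""
    return KNOWN_ANSWERS[question] if rng.random() < 0.5 else "0"


def score_attempts(problems, attempts_per_problem, rng):
    """Sample several attempts per problem and mark each 1.0 (correct) or 0.0."""
    results = []
    for item in problems:
        rewards = []
        for _ in range(attempts_per_problem):
            attempt = toy_model(item["question"], rng)
            rewards.append(1.0 if attempt.strip() == item["answer"] else 0.0)
        results.append({"question": item["question"], "rewards": rewards})
    return results


def accuracy(results):
    """Fraction of all attempts that matched the reference answer."""
    flat = [r for item in results for r in item["rewards"]]
    return sum(flat) / len(flat)


problems = json.loads(PROBLEMS_JSON)
results = score_attempts(problems, attempts_per_problem=4, rng=random.Random(0))
print(f"accuracy over all attempts: {accuracy(results):.2f}")
```

In a real run the per-attempt rewards would feed the training update, and the accuracy figure would be tracked over time on held-out benchmark problems.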

AI-Generated Review

What is Drift?

Drift is a reinforcement learning training framework built specifically for diffusion language models. While most RL training tools target autoregressive transformers, Drift focuses on models that generate text by progressively unmasking tokens, like LLaDA and Dream. It handles the full RLVR (reinforcement learning from verifiable rewards) pipeline: sampling outputs, executing code or verifying math answers as reward signals, and updating the model with policy gradient methods.
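
To make the "verifiable rewards plus policy gradient" idea concrete, here is a minimal sketch in plain Python. It assumes a group-mean baseline (rewards for several samples of the same prompt are normalized against each other) and a REINFORCE-style surrogate loss. The review only says "policy gradient methods", so the exact objective Drift uses, and how it estimates log-probabilities for a diffusion model, are not confirmed here; the function names are hypothetical.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of samples.

    Samples that beat the group average get a positive advantage,
    samples below it get a negative one.
    """
    baseline = mean(rewards)
    spread = pstdev(rewards)
    return [(r - baseline) / (spread + eps) for r in rewards]


def policy_gradient_loss(log_probs, advantages):
    """REINFORCE-style surrogate loss: -mean(advantage * log_prob).

    Minimizing it raises the log-probability of above-average samples
    and lowers it for below-average ones.
    """
    return -mean([a * lp for a, lp in zip(advantages, log_probs)])


# Four sampled answers to one prompt: two verified correct, two wrong.
rewards = [1.0, 0.0, 0.0, 1.0]
# Assumed (made-up) sequence log-probabilities for those four samples.
log_probs = [-1.2, -0.7, -2.0, -0.9]

advantages = group_advantages(rewards)
loss = policy_gradient_loss(log_probs, advantages)
print(advantages, loss)
```

In an actual framework the log-probabilities would be differentiable tensors, so backpropagating this loss would update the model weights; the plain floats here only show the arithmetic.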

The tool runs entirely on PyTorch with DeepSpeed ZeRO-3 for memory-efficient training across multiple GPUs or nodes. Users configure experiments through YAML files, point at pretrained models, and launch training with a single Accelerate command.

Why is it gaining traction?

Diffusion language models are a hot research area, but most teams are stuck using RL tools built for autoregressive models. Drift addresses this mismatch directly. It ships with built-in reward functions for code execution and math verification, so there is no need to wire up your own test runners. The block-wise parallel decoding is particularly useful for speeding up generation during rollouts, which matters when you're sampling thousands of responses per training step. With support for multiple masking strategies and temperature-based sampling, it's flexible enough to experiment with different denoising schedules.
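
As a rough picture of what a code-execution reward involves, the sketch below runs a candidate solution together with its tests in a separate Python process and returns 1.0 only if everything passes. It is an assumption-laden illustration, not Drift's reward function: the name `code_reward`, the pass/fail scoring, and the timeout value are all hypothetical, and a subprocess with a timeout is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def code_reward(candidate: str, test_code: str, timeout_s: float = 5.0) -> float:
    """Return 1.0 if the candidate passes its tests, else 0.0.

    The candidate and the tests are written to one temporary script and run
    in a fresh interpreter. A non-zero exit code (failed assert, exception,
    syntax error) or a timeout counts as failure.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate_test.py"
        script.write_text(candidate + "\n\n" + test_code + "\n", encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return 0.0
    return 1.0 if proc.returncode == 0 else 0.0


good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = "assert add(2, 3) == 5"

print(code_reward(good, tests), code_reward(bad, tests))
```

Running untrusted model output this way should be done inside a container or similar isolation in practice; the sketch only shows how a test result becomes a scalar reward.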

Who should use this?

NLP researchers working on training or fine-tuning diffusion language models with RL. Specifically: anyone trying to improve code generation on HumanEval/MBPP, improve math reasoning on MATH500, or experiment with RLVR on custom tasks. If you're evaluating diffusion LLMs for production use and need better training tooling, this is worth a look. Academic labs exploring multi-node RL training for large diffusion models will find the distributed setup straightforward.

Verdict

Drift fills a real gap in the ML tooling landscape and comes from Ant Group's research division, which adds credibility. However, with only 19 stars and minimal documentation, expect to do some spelunking to get things running. The framework is functional but clearly early-stage: there are no polished tutorials and few examples beyond the config files. If you're an ML practitioner comfortable reading YAML configs and debugging training scripts, it's a solid starting point. For teams needing production-ready stability, wait for more community iteration. The 100% credibility score reflects this: credible source, nascent project.
