LMIS-ORG

A project implementing various agentic RL methods on top of the Slime post-training framework

AI Summary

A framework reproducing agentic RL methods like AgentFlow on top of the Slime library to train LLMs for improved tool use and reasoning on math tasks.

How It Works

1. 🔍 Discover smarter AI training

You find a tool to teach AI assistants better reasoning and tool use through practice on math problems.

2. 📚 Gather practice problems

Download math challenges and a starting AI model to begin improving its skills.

3. 🔧 Prepare helpers

Set up background services that help the AI think, code, and check its work.

4. 🚀 Launch the training

Hit start and watch your AI practice step-by-step planning, tool calls, and self-correction (a minimal launch sketch follows this list).

5. 📊 Track improvements

See scores rise on tough math tests as the AI gets smarter with each round.

🎉 AI solves complex math!

Your trained assistant now handles harder problems confidently with better reasoning.
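
As a rough illustration of steps 2 and 4, here is a minimal Python sketch of the workflow. The model and dataset IDs are examples, and the training script name and flags are hypothetical placeholders rather than the repo's actual entry point; check the repo's scripts for the real commands.

```python
# Minimal workflow sketch. The script name and flags below are hypothetical
# placeholders; the model and dataset IDs are examples to adjust as needed.
import subprocess
from huggingface_hub import snapshot_download

# Step 2: fetch a base model and a math dataset.
model_dir = snapshot_download("Qwen/Qwen2.5-7B")
data_dir = snapshot_download("BytedTsinghua-SIA/DAPO-Math-17k", repo_type="dataset")

# Step 3 is assumed done separately (e.g. SGLang and tool servers running locally).
# Step 4: launch training; `train_agentic.py` is a placeholder entry point.
subprocess.run(
    [
        "python", "train_agentic.py",
        "--model-path", model_dir,
        "--data-path", data_dir,
        "--rollout-backend", "sglang",
    ],
    check=True,
)
```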

AI-Generated Review

What is slime-agentic?

Slime-agentic is a Python project on GitHub that implements agentic RL pipelines using the Slime post-training framework. It lets you train LLMs on multi-turn agent behaviors—like Planner-Executor-Verifier loops for tool use and reasoning—without manual step annotations, applying RL signals like GRPO directly to trajectories. Developers get Docker-ready setups and scripts to run training on math datasets, boosting benchmarks like AIME 2024 from 10% to 26.7% in quick experiments.
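
To make the trajectory-level RL idea concrete, here is a small self-contained Python sketch of a Planner-Executor-Verifier rollout scored with a GRPO-style group-relative advantage. The role prompts, the `llm` callable, and the outcome-only reward are illustrative stand-ins, not the project's actual interfaces.

```python
# Illustrative Planner-Executor-Verifier rollout with a GRPO-style advantage.
# `llm` is any text-completion callable; the reward is a simple final-answer
# check, since no per-step labels are assumed.
from statistics import mean, pstdev
from typing import Callable, List

def rollout(llm: Callable[[str], str], question: str, max_turns: int = 4) -> dict:
    """Run one multi-turn trajectory: plan -> execute (tool use) -> verify."""
    transcript, result = [], ""
    for _ in range(max_turns):
        plan = llm(f"Plan the next step for: {question}\nHistory: {transcript}")
        result = llm(f"Execute this step (you may call tools): {plan}")
        verdict = llm(f"Verify the result for '{question}': {result}. Reply DONE or RETRY.")
        transcript.append((plan, result, verdict))
        if "DONE" in verdict:
            break
    return {"question": question, "transcript": transcript, "answer": result}

def grpo_advantages(rewards: List[float]) -> List[float]:
    """Group-relative advantages: normalize each trajectory's reward within its group."""
    mu, sigma = mean(rewards), pstdev(rewards) or 1.0
    return [(r - mu) / sigma for r in rewards]

def train_step(llm, question: str, gold_answer: str, group_size: int = 8):
    # Sample a group of trajectories for the same question (GRPO works per group).
    trajs = [rollout(llm, question) for _ in range(group_size)]
    # Outcome reward only: 1.0 if the final answer contains the reference answer.
    rewards = [1.0 if gold_answer in t["answer"] else 0.0 for t in trajs]
    # These advantages would weight the policy-gradient update on each trajectory's tokens.
    return trajs, grpo_advantages(rewards)
```

The key point is that only the final outcome is rewarded; the group-normalized advantage turns those sparse trajectory rewards into a learning signal without per-step annotations.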

Why is it gaining traction?

It hooks into Slime's generate and reward functions, turning single-step RLHF into full agent rollouts with minimal boilerplate, which makes Copilot-style tool integration straightforward to wire up. Scripts handle SGLang inference for planners and executors, low-precision quantization, and eval baselines, making it faster to prototype than from-scratch agentic RL repos. The included Qwen2.5-7B example shows plug-and-play model swaps.
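
A rough sketch of what such hooks can look like: a rollout function that queries a locally served SGLang model through its OpenAI-compatible endpoint, plus a rule-based reward on the final answer. The hook names, signatures, port, and model tag below are assumptions for illustration, not Slime's documented API.

```python
# Sketch of custom generate/reward hooks for agentic rollouts. Assumes an
# SGLang server is already running locally with its OpenAI-compatible API;
# the hook names and signatures are illustrative, not Slime's actual interface.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

def generate_fn(prompt: str, max_turns: int = 4) -> str:
    """Multi-turn rollout: keep prompting until the model emits a boxed final answer."""
    messages = [{"role": "user", "content": prompt}]
    reply = ""
    for _ in range(max_turns):
        reply = client.chat.completions.create(
            model="default",  # placeholder model tag; match it to your server
            messages=messages,
            temperature=0.8,
        ).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        if "\\boxed{" in reply:  # treat a boxed expression as the final answer
            break
        messages.append({"role": "user", "content": "Verify your last step and continue."})
    return reply

def reward_fn(response: str, gold_answer: str) -> float:
    """Outcome-only reward: 1.0 if the boxed answer matches the reference string."""
    return 1.0 if f"\\boxed{{{gold_answer}}}" in response else 0.0
```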

Who should use this?

RL engineers fine-tuning LLMs for math, coding, or tool-use tasks, especially those already on Slime. It's for teams building reasoning agents who want quick wins on datasets like DAPO-Math without custom infra, and for Python setups testing agentic RL ideas before scaling.

Verdict

Grab it if you're in the Slime ecosystem; it's solid for agentic experiments even though its 19 stars and 1.0% credibility score signal early maturity. Docs cover Docker pulls and eval scripts, but expect tweaks for production and plan to fork it for custom tools. A promising starter for Slime-based agentic RL, not yet battle-tested at scale.
