TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents
TCOD is a research framework that applies temporal curriculum learning to improve on-policy distillation for training multi-turn AI agents in interactive environments like ALFWorld, ScienceWorld, and WebShop.
How It Works
Use TCOD to train AI agents that reason and act over many turns, for example in text-based games or simulated shopping sites.
Set up your machine with the required software and connect an LLM inference service.
Download the ready-made environments, such as household tasks (ALFWorld) or online shopping (WebShop), so the agent can practice on realistic tasks.
Pick a curriculum style, such as steady steps or backward-to-forward, to move the agent gradually from easy settings to the full task.
Launch training and the student agent learns from a stronger teacher model, improving turn by turn.
Watch the charts and scores update as training runs; they show how stability and task performance change on the practice tasks.
The trained agent handles long multi-turn tasks more reliably and, according to the project, can even surpass its teacher.
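The curriculum choice in the steps above can be pictured with a small sketch. This is not TCOD's code: the function name `student_turn_window`, its parameters, and the reading of the two styles are all assumptions made for illustration. Here "forward" is taken to mean the student starts with the first turn and its window grows steadily, and "backward" is taken to mean the student starts with only the final turn and the window expands toward the beginning of the episode.

```python
def student_turn_window(step, total_steps, max_turns, mode="forward"):
    """Hypothetical helper: which turns of an episode the student is trained on.

    Returns (start, end) turn indices, with end exclusive.
    The window grows linearly from 1 turn to max_turns over training.
    """
    if total_steps <= 0 or max_turns <= 0:
        raise ValueError("total_steps and max_turns must be positive")

    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)

    # Number of turns currently in the curriculum window.
    horizon = 1 + int(progress * (max_turns - 1))

    if mode == "forward":
        # Assumed "steady steps": start at turn 0 and extend the window.
        return 0, horizon
    if mode == "backward":
        # Assumed "backward-to-forward": start at the final turn and
        # expand the window toward the beginning of the episode.
        return max_turns - horizon, max_turns
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step,
              student_turn_window(step, 100, 10, "forward"),
              student_turn_window(step, 100, 10, "backward"))
```

Under either style the window covers the full episode by the end of training, so the agent finishes on the complete task. The real framework may define its curricula differently.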