kjeiun / amuse

Public

AMUSE optimizer implementation

100% credibility
Found May 27, 2026 at 19 stars.
AI Analysis
Python
AI Summary

AMUSE is an academic research project that implements a new optimization algorithm for training deep learning models. It combines two existing techniques, Muon's rapid progress and Schedule-Free's stability, to create an optimizer that works well for both image classification and language model training. The project provides ready-to-use scripts for common AI training tasks, along with implementations of various other optimizers for comparison. It's designed for researchers and practitioners who want to train neural networks more efficiently.

How It Works

1. 📚 You discover AMUSE through research

You come across a new optimization method for training AI models that promises faster results with less effort.

2. 🔍 You explore the documentation

You read through the clear guides showing how AMUSE works on both image recognition and language tasks.

3. 🛠️ You set up your environment

You install the required tools and prepare your computer for training AI models.

4. You choose your training task

🖼️ Image Classification: Work with datasets like CIFAR or ImageNet to recognize pictures

📝 Language Model: Train models to understand and generate text

5. Your model trains with AMUSE

The optimizer smoothly trains your model without needing complex learning rate schedules.

6. 📊 You watch the progress

Training metrics appear in real-time, showing your model improving faster than with older methods.

🎉 You achieve better results

Your trained model performs well, and you got there more efficiently than using traditional optimizers.

AI-Generated Review

What is amuse?

AMUSE is a Python optimizer for deep learning that combines two recent advances: Muon's Newton-Schulz orthogonalization for matrix parameters, and Schedule-Free averaging. The implementation handles both image classification and language model pretraining, with ready-to-use scripts for CIFAR, ImageNet, and FineWeb datasets. It requires no learning rate schedules -- you set a base learning rate and it trains to completion without decay. The key trick is a time-varying interpolation that starts near Muon's fast trajectory for quick adaptation, then gradually shifts toward a stable averaged sequence to reduce oscillations.
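
As a rough illustration of the orthogonalization step mentioned above, the sketch below runs the textbook cubic Newton-Schulz iteration in pure Python. It is not taken from the repository: the function name `newton_schulz_orthogonalize`, the helper names, the step count, and the cubic coefficients (1.5 and -0.5) are assumptions made for this example, and Muon implementations typically use a tuned higher-order polynomial instead.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(grad, steps=10, eps=1e-7):
    """Approximate the nearest orthogonal matrix to `grad` (hypothetical helper).

    Uses the classic cubic iteration X <- 1.5 * X - 0.5 * X X^T X.
    Dividing by the Frobenius norm first keeps every singular value
    at or below 1, inside the region where the iteration converges.
    """
    norm = math.sqrt(sum(v * v for row in grad for v in row))
    x = [[v / (norm + eps) for v in row] for row in grad]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [
            [1.5 * xv - 0.5 * cv for xv, cv in zip(x_row, c_row)]
            for x_row, c_row in zip(x, xxt_x)
        ]
    return x
```

For a symmetric positive-definite input such as `[[3.0, 1.0], [1.0, 2.0]]`, the result should approach the identity matrix, because the nearest orthogonal matrix to such an input is the identity. Very small singular values would need more steps than the default used here.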

Why is it gaining traction?

The optimizer claims to improve the performance-iteration Pareto frontier over both AdamW and Muon individually. For practitioners, the main appeal is schedule-free operation -- no babysitting learning rate schedules, no cooldown phases, and the ability to stop training at any checkpoint without performance degradation. The paper provides a theoretical lens (river-valley loss landscapes) explaining why Muon's orthogonalization can cause oscillations and how AMUSE addresses them. The repository bundles comparisons against 15+ other optimizers including Lion, Prodigy, Sophia, and SOAP, making it a useful benchmark harness even if you don't use AMUSE itself.
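
The shift from a fast trajectory to an averaged one can be pictured with a toy one-dimensional problem standing in for the steep direction of a valley. The sketch below is an illustration only, not the AMUSE update rule: the function name `run_toy`, the schedule `beta = t / (t + warmup)`, the sign-of-gradient step (a simple stand-in for an orthogonalized update, which also discards gradient magnitude), and evaluating the gradient at the fast iterate are all assumptions made for this example.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def run_toy(steps, lr=0.1, start=0.05, warmup=10.0):
    """Track fast, averaged, and blended iterates on f(w) = 0.5 * w**2.

    All names and the beta schedule are hypothetical choices for this sketch.
    """
    z = start  # fast iterate: fixed-size sign steps, so it never settles
    x = start  # running average of the fast iterates
    fast, averaged, blended = [], [], []
    for t in range(1, steps + 1):
        grad = z                      # gradient of 0.5 * w**2 at z
        z = z - lr * sign(grad)       # magnitude-free step
        x = x + (z - x) / t           # uniform mean of z_1 .. z_t
        beta = t / (t + warmup)       # assumed schedule, rising from 0 toward 1
        y = (1.0 - beta) * z + beta * x
        fast.append(z)
        averaged.append(x)
        blended.append(y)
    return fast, averaged, blended


if __name__ == "__main__":
    fast, averaged, blended = run_toy(200)
    for t in (1, 2, 10, 50, 200):
        print(t, fast[t - 1], averaged[t - 1], blended[t - 1])
```

With these defaults, the fast iterate keeps bouncing between +0.05 and -0.05, while the blended iterate tracks it closely during the first few steps and then settles near the running average, so its oscillation shrinks as training proceeds.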

Who should use this?

ML researchers comparing optimizer performance will find the most value here -- the codebase is structured as a benchmark suite with consistent logging and evaluation across optimizers. LLM pretraining practitioners working with FineWeb or similar datasets may want to try AMUSE in place of AdamW or a Schedule-Free optimizer. For production use cases, the 19-star count and recent publication date mean this is still early-stage; teams should validate against their specific workloads before committing.

Verdict

The 1.0% credibility score reflects a new, low-visibility project with limited community validation. The academic backing (arXiv paper) adds legitimacy, but the lack of tests and minimal documentation make it risky for production systems. If you're running optimizer comparisons for research, this is worth a look. For production training pipelines, wait for broader adoption and third-party benchmarks.
