trevin-creator/autoresearch-mlx

Apple Silicon (MLX) port of Karpathy's autoresearch — autonomous AI research loops on Mac, no PyTorch required.

686 stars · 121 forks · 100% credibility
Found Mar 09, 2026 at 376 stars.
AI Analysis · Python
AI Summary

An Apple Silicon Mac version of an autonomous research tool where an AI agent iteratively improves tiny language model training setups within fixed short time budgets.

How It Works

1. 🔍 Find the magic Mac tool

You stumble on a fun project that lets an AI automatically get better at training small language models right on your Apple Silicon Mac.

2. 💻 Make sure your Mac is ready

Check that you have an Apple Silicon Mac (M1 or later) with Python and the uv package manager installed.

3. 📥 Grab the files and prep data

Clone the repo and run the one-time data-preparation step so the training data is ready to go.

4. 🚀 Try your first quick test

Launch a short practice run to watch it train a mini model and report how well it scores.

5. 🤖 Hand it to an AI helper

Point an AI assistant such as Claude at the guide notes, and it starts changing settings, testing them, and keeping the best ones automatically.

6. Let it work while you rest

Set it loose overnight so it can try dozens of ideas on its own without you lifting a finger.

7. 🎉 Celebrate smarter results

Wake up to a bunch of improved setups with better scores, showing your model learned way more efficiently.
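The loop the steps above describe can be sketched in plain Python. This is a minimal, hypothetical illustration, not the repo's actual API: the function names, config keys, and toy scoring function are all assumptions. An agent mutates one hyperparameter at a time, runs a short budgeted experiment, and keeps a change only if validation bits-per-byte improves.

```python
import random

def run_experiment(config):
    """Stand-in for a short MLX training run that returns val_bpb.
    A toy objective: the best score (1.0) sits at depth=4, batch_size=32."""
    return 1.0 + 0.05 * abs(config["depth"] - 4) + 0.01 * abs(config["batch_size"] - 32)

def mutate(config, rng):
    """Agent step: nudge one hyperparameter up or down by 1."""
    new = dict(config)
    key = rng.choice(["depth", "batch_size"])
    new[key] = max(1, new[key] + rng.choice([-1, 1]))
    return new

def research_loop(config, budget=50, seed=0):
    """Hill-climb: accept a candidate only when its val_bpb is lower."""
    rng = random.Random(seed)
    best_bpb = run_experiment(config)
    for _ in range(budget):          # e.g. an overnight run of short experiments
        candidate = mutate(config, rng)
        bpb = run_experiment(candidate)
        if bpb < best_bpb:           # the real tool would also git-commit the win
            config, best_bpb = candidate, bpb
    return config, best_bpb

best_config, best_bpb = research_loop({"depth": 8, "batch_size": 16})
print(best_config, round(best_bpb, 3))
```

The real agent edits a config file and launches genuine training runs; the shape of the loop, though, is exactly this: propose, measure, keep the better score.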


Star Growth

From 376 stars at discovery to 686 stars.
AI-Generated Review

What is autoresearch-mlx?

This Python project ports Karpathy's autoresearch to Apple Silicon Macs using MLX, enabling autonomous AI research loops for language-model training without PyTorch or CUDA. Developers point an AI agent such as Claude at a config file; it then iteratively tweaks hyperparameters, runs 5-minute training experiments, evaluates bits-per-byte loss, and commits improvements via git, all natively on M-series chips. The result is overnight optimization of tiny models on climbmix data, reaching val_bpb under 1.3 on an M4 Max.
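For context on the val_bpb metric: bits-per-byte converts a model's cross-entropy loss (in nats per token) into bits per byte of raw text, which makes models with different tokenizers comparable. A sketch of the conversion; the function and variable names here are illustrative, not the repo's code:

```python
import math

def bits_per_byte(nats_per_token, n_tokens, n_bytes):
    """Convert mean cross-entropy (nats per token) to bits per byte of text."""
    total_nats = nats_per_token * n_tokens
    total_bits = total_nats / math.log(2)   # nats -> bits
    return total_bits / n_bytes

# e.g. a loss of 0.9 nats/token over 1000 tokens covering 4000 bytes of text:
print(round(bits_per_byte(0.9, 1000, 4000), 3))  # -> 0.325
```

For a pure byte-level model, tokens and bytes coincide, so bits-per-byte is simply the loss divided by ln 2.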

Why is it gaining traction?

It stands out by ditching GPU dependencies in favor of Apple Silicon's unified memory, delivering 8-9 experiments per hour versus slower PyTorch setups on other hardware. The hook is true autonomy: agents discover hardware-specific optima, such as smaller depths and batch sizes that maximize steps within the fixed budget, beating baselines by 20%+ across M1 to M4 hardware. There are no containerization hassles: just run uv sync, prepare the data once, and go.

Who should use this?

ML hobbyists and researchers on Apple Silicon Macs experimenting with nano-scale pretraining, especially those comparing against H100 results but stuck without CUDA. Also indie devs tuning autonomous agents for hyperparameter search, or teams evaluating Apple Silicon versus Intel for edge training without cloud costs.

Verdict

Grab it if you have an Apple Silicon Mac and want agent-driven LLM tinkering: the star count shows early buzz, and the README is solid with results tables, but the low credibility score flags it as experimental, with no tests. Solid for quick prototypes; fork and contribute for production.

