AutoSAE is an open-source experiment demonstrating an AI agent autonomously iterating on sparse autoencoder designs to achieve high-fidelity reconstruction of Gemma 3 1B model activations.
How It Works
You find AutoSAE, an experiment in which an AI agent iterates on its own sparse autoencoder (SAE) designs, the tools used to inspect a language model's internal activations.
You install the one required helper tool for working with the model's internal activations.
You run the language model on sample texts and save its hidden activations, ready for analysis.
You launch training, which fits sparse autoencoder features (the "detectors") to those activations and runs quickly on your own hardware.
You create charts showing how reconstruction fidelity improves with each design iteration, from weak to almost perfect.
You inspect which text each detector activates on, seeing patterns such as topics or ideas.
You now have detectors that reconstruct nearly all of the model's activations, ready for deeper interpretability work.
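The steps above can be illustrated with a toy sketch. It is not AutoSAE's code: the function names, the hand-picked weights, and the 2-dimensional "activations" are all assumptions chosen so the arithmetic is easy to follow. It shows the generic shape of a sparse autoencoder (a ReLU encoder and a linear decoder), one common way to score reconstruction fidelity (fraction of variance explained), and how to rank samples by how strongly a single detector fires.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def encode(x: Vector, w_enc: Matrix, b_enc: Vector) -> Vector:
    """Feature activations: ReLU(W_enc @ x + b_enc), one value per detector."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]


def decode(f: Vector, w_dec: Matrix, b_dec: Vector) -> Vector:
    """Reconstruction: b_dec plus each detector's direction scaled by its activation."""
    out = list(b_dec)
    for activation, direction in zip(f, w_dec):
        for j, d in enumerate(direction):
            out[j] += activation * d
    return out


def variance_explained(acts: Matrix, recons: Matrix) -> float:
    """1 - (squared reconstruction error) / (squared deviation from the mean)."""
    n, dim = len(acts), len(acts[0])
    mean = [sum(x[j] for x in acts) / n for j in range(dim)]
    resid = sum((xj - rj) ** 2 for x, r in zip(acts, recons) for xj, rj in zip(x, r))
    total = sum((xj - mj) ** 2 for x in acts for xj, mj in zip(x, mean))
    return 1.0 - resid / total


def top_activating(feature_acts: Matrix, feature: int, k: int) -> List[int]:
    """Indices of the k samples on which the given detector fires most strongly."""
    ranked = sorted(range(len(feature_acts)),
                    key=lambda i: feature_acts[i][feature], reverse=True)
    return ranked[:k]


def run_sae(acts: Matrix, directions: Matrix) -> Tuple[Matrix, Matrix]:
    """Toy SAE with tied encoder/decoder weights and zero biases (an assumption)."""
    zeros_f = [0.0] * len(directions)
    zeros_x = [0.0] * len(acts[0])
    feats = [encode(x, directions, zeros_f) for x in acts]
    recons = [decode(f, directions, zeros_x) for f in feats]
    return feats, recons


# Stand-in for saved model activations (hypothetical values, one row per sample).
ACTS: Matrix = [[1.0, 0.0], [-2.0, 0.5], [0.0, 3.0], [0.5, -1.0]]

# Hand-picked detector directions; a real run would learn these by training.
DIRECTIONS: Matrix = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

# A weaker "design" with only two detectors versus the full set of four.
_, partial_recons = run_sae(ACTS, DIRECTIONS[:2])
full_feats, full_recons = run_sae(ACTS, DIRECTIONS)

partial_score = variance_explained(ACTS, partial_recons)
full_score = variance_explained(ACTS, full_recons)

# Samples that most strongly activate detector 2 (the +y direction here).
top_for_feature_2 = top_activating(full_feats, feature=2, k=2)

print(f"2 detectors: {partial_score:.3f}  4 detectors: {full_score:.3f}")
print("top samples for detector 2:", top_for_feature_2)
```

In this toy setup the two-detector version can only reconstruct one coordinate, so it scores lower than the four-detector version, which mirrors the weak-to-nearly-perfect progression described in the charting step. A real pipeline would learn the weights by gradient descent with a sparsity constraint rather than fixing them by hand.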