confluence-labs / arc-agi-2

Public

State-of-the-art ARC-AGI-2 solver by Confluence Labs

confluence.sh

100% credibility

Found Feb 25, 2026 at 80 stars.

AI Analysis

Python

AI Summary

Open-source project that reproduces a reported state-of-the-art 97.92% score on the public ARC-AGI-2 evaluation set, using teams of AI agents that run in secure sandboxed workspaces.

How It Works

🔍 Discover the puzzle solver

You stumble upon this exciting project that uses smart AI helpers to crack tough brain-teaser puzzles from the ARC-AGI challenge.

📝 Sign up for helpers

You create free accounts with an AI model service (Gemini) and a secure cloud sandbox provider (E2B) to power your solver.

🔗 Link your services

You add your private API keys so your AI helpers can think and work safely in protected spaces.

⚙️ Prepare your setup

You download the project files and get everything ready on your computer with a quick preparation step.

▶️ Launch the solver

With one command, you start a team of AI agents tackling all the puzzles at once—it feels magical as they collaborate.

⏳ Watch progress unfold

You monitor live updates as agents refine solutions over loops, building confidence with each update.

📊 Check results and costs

The tool wraps up, shows your puzzle-solving score, total spending, and confirms everything stayed secure.

🎉 Achieve top scores

You celebrate cracking 97.92% of the public puzzles, ready to submit for the ARC Prize with pride!
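As a rough illustration of the "check results" step, the sketch below scores a submission against known answers. The page only says the run produces a submission.json; the layout assumed here (each task ID maps to a list with one entry per test input, each holding `attempt_1` and `attempt_2` grids) and the names `score_submission` and `solutions` are hypothetical and not taken from the repo.

```python
from typing import Dict, List

Grid = List[List[int]]


def score_submission(submission: Dict[str, list],
                     solutions: Dict[str, List[Grid]]) -> float:
    """Return the mean per-task score as a percentage.

    Assumed layout (not confirmed by the source):
      submission[task_id] -> [{"attempt_1": grid, "attempt_2": grid}, ...]
      solutions[task_id]  -> [expected_grid, ...], one per test input
    """
    task_scores = []
    for task_id, expected_outputs in solutions.items():
        attempts_per_test = submission.get(task_id, [])
        solved = 0
        for i, expected in enumerate(expected_outputs):
            attempts = attempts_per_test[i] if i < len(attempts_per_test) else {}
            # A test input counts as solved if either attempt matches exactly.
            if expected in (attempts.get("attempt_1"), attempts.get("attempt_2")):
                solved += 1
        # A task with several test inputs can earn partial credit.
        task_scores.append(solved / len(expected_outputs))
    return 100.0 * sum(task_scores) / len(task_scores)


# Toy data: task "a" is fully solved, task "b" is half solved.
solutions = {"a": [[[1]]], "b": [[[2]], [[3]]]}
submission = {
    "a": [{"attempt_1": [[0]], "attempt_2": [[1]]}],
    "b": [{"attempt_1": [[2]], "attempt_2": [[2]]},
          {"attempt_1": [[9]], "attempt_2": [[8]]}],
}
print(score_submission(submission, solutions))  # 75.0
```

Under this assumed scheme a task with two test inputs can contribute half a point, which is one way a score can land on a fractional value.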


Star Growth

This repo grew from 80 to 88 stars.


AI-Generated Review

What is arc-agi-2?

This Python solver from Confluence Labs tackles ARC-AGI-2, a demanding benchmark of abstraction and reasoning used to track progress toward AGI. It spins up parallel Gemini agents in secure E2B sandboxes to generate and refine Python code that transforms input grids into outputs, reaching 97.92% on the public ARC-AGI-2 evaluation set. Users get a one-command run via a bash script: set the Gemini and E2B API keys, then launch it to produce submission.json, scores, and cost breakdowns.
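To make "Python code that transforms input grids into outputs" concrete, here is a minimal sketch of checking one candidate program against a task's training pairs. It assumes the commonly used ARC task layout (a `train` list of `input`/`output` grid pairs); `passes_training`, `flip_horizontal`, and `toy_task` are hypothetical names for illustration, not the repo's code. The real project runs generated code inside E2B sandboxes, whereas this sketch calls the candidate in-process, which is only reasonable for trusted code.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]


def passes_training(transform: Callable[[Grid], Grid], task: Dict) -> bool:
    """Return True only if the candidate reproduces every training output."""
    for pair in task["train"]:
        try:
            if transform(pair["input"]) != pair["output"]:
                return False
        except Exception:
            # A crashing candidate is treated as a failed candidate.
            return False
    return True


def flip_horizontal(grid: Grid) -> Grid:
    """Example candidate: mirror each row left to right."""
    return [list(reversed(row)) for row in grid]


# Toy task in the assumed layout; real tasks use larger grids.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 4, 0]], "output": [[0, 4, 3]]},
    ],
    "test": [{"input": [[5, 0, 0]]}],
}

if passes_training(flip_horizontal, toy_task):
    # Only a candidate that fits all training pairs is applied to the test input.
    print(flip_horizontal(toy_task["test"][0]["input"]))
```

A refinement loop could feed the failing pairs back to the agent as feedback, though how the repo actually does this is not described on this page.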

Why is it gaining traction?

It saturates the public ARC-AGI-2 evaluation set, outscoring most entries on the ARC-AGI-2 leaderboard, with configurable agent counts (up to 12 per input), iteration loops (10 max), and high concurrency (132 sandboxes). Unlike pure RL approaches or custom-trained models, it drives Gemini 3 preview through the CLI to get state-of-the-art results without any training, and it adds resume support and partial-result safety nets. Developers like the transparency: per-task costs, token usage, and readable logs for dissecting agent reasoning.
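The three knobs mentioned above (agents per input, iteration cap, sandbox concurrency) could be wired together roughly as follows. This is a sketch using `asyncio` with a semaphore, not the repo's implementation: the constant names, `run_agent`, and `solve_all` are hypothetical, and the agent body is a stub that stands in for a real Gemini call and sandbox run.

```python
import asyncio
from typing import Dict, List, Optional

# Values quoted in the review; the constant names are assumptions.
AGENTS_PER_INPUT = 12
MAX_ITERATIONS = 10
MAX_CONCURRENCY = 132


async def run_agent(task_id: str, agent_id: int,
                    sem: asyncio.Semaphore) -> Optional[str]:
    """Stub agent: refine for up to MAX_ITERATIONS, return a label on success."""
    async with sem:  # at most MAX_CONCURRENCY agents hold a "sandbox" at once
        for iteration in range(1, MAX_ITERATIONS + 1):
            await asyncio.sleep(0)  # placeholder for model call + sandbox run
            # Stub success rule so the example terminates deterministically.
            if iteration == (agent_id % MAX_ITERATIONS) + 1:
                return f"{task_id}:agent{agent_id}:iter{iteration}"
    return None  # iteration budget exhausted without a solution


async def solve_all(task_ids: List[str]) -> Dict[str, List[str]]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    jobs = [run_agent(t, a, sem)
            for t in task_ids for a in range(AGENTS_PER_INPUT)]
    results = await asyncio.gather(*jobs)
    grouped: Dict[str, List[str]] = {t: [] for t in task_ids}
    for result in results:
        if result is not None:
            grouped[result.split(":")[0]].append(result)
    return grouped


results = asyncio.run(solve_all(["t1", "t2"]))
print({task: len(found) for task, found in results.items()})
```

The semaphore keeps the number of simultaneously active agents bounded regardless of how many tasks are queued, which is one simple way to respect a sandbox quota.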

Who should use this?

AI researchers chasing ARC-AGI-2 leaderboard spots or running ARC-AGI 2025 evals, especially those benchmarking LLMs on ARC tasks. Experimenters building agentic workflows for grid-based puzzles or Poetiq-style reasoning. Teams prototyping AGI solvers, in the mold of Confluence Labs, without infra headaches.

Verdict

Grab it if you're deep in ARC-AGI-2: it delivers state-of-the-art results out of the box on the public evaluation data. With 88 stars and a 100% credibility score, it's early-stage (solid README, no tests), so validate on private data before relying on it.



Base stars: 88 stars

Bonus: AI verified quality (100%)

Account age: 67 days

Repo age: 6 days

License: MIT

Updated: Mar 01, 2026