evilsocket / audit

Public

An 8-stage vulnerability-discovery agent.

38
2
69% credibility
Found May 19, 2026 at 287 stars.
AI Analysis
Python
AI Summary

An 8-stage automated vulnerability discovery tool that uses multiple AI agents working together to find security problems in code. It first maps out a codebase, then has many narrow helpers hunt for specific attack patterns, uses different AI helpers to challenge each finding, and finally traces whether attacks could actually reach vulnerable code. The tool produces a structured report of confirmed, reachable security issues. It uses Claude Code subscription billing rather than metered API calls, and explicitly warns users to run audits inside disposable environments when auditing untrusted code.

How It Works

1. 📖 You hear about a smarter way to find bugs

You discover a tool based on research from Cloudflare that uses multiple AI helpers working together to find security problems in code, rather than asking one AI to do everything.

2. ⚙️ You install the tool on your computer

You run a simple installation command and the tool sets itself up, ready to scan any codebase you point it toward.

3. 🔐 You connect your AI subscription

If you already use Claude Code, you're ready to go. Otherwise you generate a special login code that lets the tool use your subscription instead of paying per query.

4. 🎯 You point it at code you want to audit

You tell the tool which folder contains the code you want checked, give the run a name like 'my-audit', and watch it begin its 8-stage investigation.

5. The AI helpers work through 8 stages

🔎 Many helpers work in parallel

The tool sends out 50 helpers at once, each looking for one type of security problem in one part of your code.

⚠️ Deliberate disagreement

A separate AI reads every finding and tries to prove it wrong. This catches false positives before they reach your report.

🧮 Reachability check

The tool checks whether an attacker could actually reach each bug from the outside, filtering out noise.
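
The three steps above (parallel hunting, adversarial challenge, reachability filtering) can be pictured as a funnel. The sketch below is only an illustration of that shape, not the tool's actual code: `Finding`, `hunt`, `challenge`, and `run_funnel` are hypothetical names, and the stub rules inside them stand in for real model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    attack_class: str
    reachable: bool  # stand-in for what a real reachability trace would compute


def hunt(task):
    """Hypothetical narrow hunter: one attack class in one file."""
    file, attack_class = task
    # Stub: every task flags something; only code under "api/" counts as exposed.
    return Finding(file, attack_class, reachable=file.startswith("api/"))


def challenge(finding):
    """Hypothetical adversarial validator; True means the finding survives."""
    # Stub rule: the validator manages to disprove the path-traversal claim.
    return finding.attack_class != "path-traversal"


def run_funnel(tasks, max_workers=50):
    # Step 1: fan out narrow hunt tasks in parallel (the page mentions 50 at once).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        candidates = list(pool.map(hunt, tasks))
    # Step 2: keep only findings the validator fails to disprove.
    survivors = [f for f in candidates if challenge(f)]
    # Step 3: keep only findings an attacker could reach from outside.
    return [f for f in survivors if f.reachable]


tasks = [
    ("api/upload.py", "path-traversal"),
    ("api/login.py", "sql-injection"),
    ("scripts/migrate.py", "sql-injection"),
]
report = run_funnel(tasks)
```

With these stub rules, one candidate is dropped by the challenger and one by the reachability filter, so only the `api/login.py` finding reaches the report.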

6. 💰 You set a spending limit

You can cap how much the tool spends, so it stops cleanly once it has spent a set amount (for example, $30) or run a set number of initial tasks (for example, 15), which is useful for keeping audits affordable.
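
As a rough sketch of how such a cap might behave, assuming the limits are checked before each new task starts: `run_with_caps` and `run_task` are hypothetical names, and the default values simply mirror the $30 and 15-task figures mentioned above.

```python
def run_with_caps(tasks, run_task, max_spend=30.0, max_tasks=15):
    """Run tasks in order, stopping before a new task once either cap is hit.

    run_task is a hypothetical callable that executes one task and
    returns what it cost.
    """
    spent = 0.0
    done = []
    for task in tasks:
        # Check both caps before starting more work, so the run stops cleanly.
        if spent >= max_spend or len(done) >= max_tasks:
            break
        spent += run_task(task)
        done.append(task)
    return done, spent


# Expensive tasks hit the spending cap first.
done, spent = run_with_caps(range(100), lambda task: 4.0)

# Cheap tasks hit the task cap first.
cheap_done, cheap_spent = run_with_caps(range(100), lambda task: 0.5)
```

Because a task's cost is only known after it runs, this design can overshoot the spending cap by at most one task. It never interrupts work in progress.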

📋 You get a clean security report

The tool produces a structured report listing every confirmed vulnerability, how severe it is, where it lives in your code, and step-by-step how an attacker could reach it.

AI-Generated Review

What is audit?

Audit is an automated vulnerability scanner that runs your codebase through an 8-stage pipeline powered by Claude Code's subscription model. Instead of asking one model to "find bugs," it spawns many narrow agents that each hunt for specific attack classes, then runs a second adversarial agent to disprove those findings before ever trusting them. The pipeline maps your repo, generates targeted hunt tasks, validates findings with a different model, deduplicates by root cause, traces whether attacker input can actually reach the vulnerable code, and finally generates a structured report. It runs entirely through a CLI with commands like `audit run`, `audit status`, and `audit report`.
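
To make "deduplicates by root cause" concrete, here is a minimal sketch that assumes each finding carries a `root_cause` label and a `location`. Both field names and the function `dedupe_by_root_cause` are hypothetical; the page does not describe how the tool actually represents findings.

```python
from collections import defaultdict


def dedupe_by_root_cause(findings):
    """Merge findings that share a root cause into one entry per cause."""
    groups = defaultdict(list)
    for finding in findings:
        groups[finding["root_cause"]].append(finding)
    # One merged entry per root cause, listing every place it was reported.
    return [
        {
            "root_cause": cause,
            "locations": sorted({f["location"] for f in group}),
        }
        for cause, group in groups.items()
    ]


findings = [
    {"root_cause": "storage.save joins unsanitized path", "location": "api/upload.py:41"},
    {"root_cause": "storage.save joins unsanitized path", "location": "api/import.py:88"},
    {"root_cause": "login query built by string concat", "location": "api/login.py:17"},
]
merged = dedupe_by_root_cause(findings)
```

Here two reports that trace back to the same unsafe helper collapse into a single entry with two locations, so the report counts the bug once.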

Why is it gaining traction?

The architecture is based on Cloudflare's Project Glasswing research, which found that real vulnerability discovery comes from deliberate disagreement between models rather than one exhaustive scan. The reachability trace is the gating step that separates signal from noise. The subscription billing model means you don't pay per-token, and the auth layer scrubs API keys to prevent silent routing around your subscription. Cost containment flags let you cap concurrency and total spend before running.
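
One way to picture the key scrubbing is a helper that builds a cleaned environment for child processes. This is a sketch under assumptions, not the tool's auth layer: `subscription_env` and `SCRUBBED_VARS` are hypothetical names, and treating `ANTHROPIC_API_KEY` as the variable to remove is an assumption about which credential would trigger metered billing.

```python
import os

# Assumption: variables that could silently route requests to metered API billing.
SCRUBBED_VARS = ("ANTHROPIC_API_KEY",)


def subscription_env(base_env=None):
    """Return a copy of the environment with metered-billing keys removed."""
    env = dict(os.environ if base_env is None else base_env)
    for name in SCRUBBED_VARS:
        env.pop(name, None)  # missing keys are fine
    return env


clean = subscription_env({"PATH": "/usr/bin", "ANTHROPIC_API_KEY": "sk-example"})
```

The resulting dictionary could then be passed as the `env` argument to `subprocess.run` when launching an agent process, so the key never reaches the child.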

Who should use this?

Security teams auditing internal codebases who already pay for Claude Pro or Max. Red teamers who want automated PoC generation for specific attack classes. Developers who want a structured, auditable vulnerability review process rather than ad-hoc LLM queries. It is not yet suited to production CI/CD, given its maturity level.

Verdict

At 38 stars with a 69% credibility score, this is early-stage but grounded in serious research. The subscription model keeps costs predictable, but the pipeline can still burn through your Claude quota fast at default concurrency. Worth evaluating if you want systematic vulnerability discovery; treat it as experimental and run it in a sandboxed environment.
