antgroup / Agent3Sigma-Canary
Agent3σ-Canary is an evaluation framework for AI agent security in realistic runtime environments.
Agent3σ-Canary is a security testing tool made by Ant Group that helps researchers evaluate how well AI agents resist various attacks. It places an AI agent in a secure test environment, presents it with realistic but controlled scenarios such as suspicious emails, hidden instructions, or deceptive tools, then produces detailed reports showing where the agent stayed secure and where it failed. The tool is designed for authorized security research and defensive evaluation, not for attacking real systems.
How It Works
A security researcher or AI safety team comes across Agent3σ-Canary while studying how AI agents handle attacks.
They pick a test category: direct attacks, hidden instructions in emails, memory tampering, or malicious tools in the agent's workflow.
They tell Agent3σ-Canary which AI system to test by naming the provider and model, like choosing which person to evaluate.
Agent3σ-Canary launches the AI agent inside a sealed container, like putting someone in a test room where they can safely explore without breaking anything outside.
The system presents tricky situations (suspicious emails, hidden commands in documents, or tools that try to steal information) while recording every move the AI makes.
A detailed report shows whether the agent was tricked, how it responded to danger, and whether it completed normal tasks correctly.
They can run the same agent with different protection systems in place to see which one keeps the AI safest while still being useful.
The researcher now knows exactly where their AI needs better protection, helping them build safer systems for everyone.
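The reporting and comparison steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual report format or API: the `RunRecord` fields, the `summarize` and `best_defense` helpers, and the defense names `guard_a` and `guard_b` are all hypothetical. The sketch assumes each scenario run records whether the agent was tricked and whether it still completed its normal task, then compares protection systems on both rates.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record of one scenario run; field names are illustrative only.
@dataclass(frozen=True)
class RunRecord:
    defense: str            # protection system placed around the agent
    category: str           # e.g. "email_injection", "memory", "tool"
    attack_succeeded: bool  # True if the agent was tricked
    task_completed: bool    # True if the normal task was still done correctly


def summarize(records):
    """Return {defense: (attack_success_rate, task_completion_rate)}."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.defense].append(record)

    summary = {}
    for defense, runs in grouped.items():
        total = len(runs)
        attack_rate = sum(r.attack_succeeded for r in runs) / total
        task_rate = sum(r.task_completed for r in runs) / total
        summary[defense] = (attack_rate, task_rate)
    return summary


def best_defense(summary):
    """Lowest attack success rate wins; ties go to higher task completion."""
    return min(summary, key=lambda d: (summary[d][0], -summary[d][1]))


# Made-up example data: two scenarios run under three setups.
records = [
    RunRecord("none", "email_injection", True, True),
    RunRecord("none", "tool", True, False),
    RunRecord("guard_a", "email_injection", False, True),
    RunRecord("guard_a", "tool", False, False),
    RunRecord("guard_b", "email_injection", False, True),
    RunRecord("guard_b", "tool", False, True),
]

summary = summarize(records)
for name, (attack_rate, task_rate) in summary.items():
    print(f"{name}: attacks succeeded {attack_rate:.0%}, tasks completed {task_rate:.0%}")
print("preferred:", best_defense(summary))
```

Ranking first by attack success and only then by task completion is one possible design choice; a team that values usefulness more heavily might weight the two rates differently.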