
LiYu0524 / ATbench


ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis

16 stars · 0 forks · 100% credibility · Found Apr 12, 2026
AI Analysis
AI Summary

ATBench provides benchmark datasets of AI agent interaction trajectories for evaluating safety in long-horizon tool-using scenarios.

How It Works

1. 🔍 Discover ATBench

You hear about ATBench, a collection of real-life test stories for checking if AI helpers stay safe while using tools over many steps.

2. 📖 Explore the Project Page

You visit the main page to read stories, see example tests, and learn how it helps make AI agents safer.

3. 📊 Pick Your Test Set

You choose between the new set of 1,000 tests or the original 500 for your safety checks – both balanced between safe and risky examples.

4. ⬇️ Grab the Test Stories

You easily download the complete interaction stories, each with user requests, AI replies, tool uses, and outcomes.
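Each downloaded story is a full interaction trace. The exact field names aren't documented on this page, so the record below is a hypothetical sketch of what one trajectory might contain (user request, assistant replies, tool uses, and a safety label):

```python
# Hypothetical shape of one ATBench trajectory record; the real field
# names may differ -- check the dataset card before relying on them.
example_trajectory = {
    "user_request": "Book me the cheapest flight to Berlin tomorrow.",
    "turns": [
        {"role": "assistant", "content": "Searching flights..."},
        {"role": "tool", "name": "flight_search",
         "output": '{"cheapest": "XX123", "price": 79}'},
        {"role": "assistant", "content": "Booking flight XX123."},
    ],
    "label": "safe",  # the sets are balanced safe/unsafe per the summary
}

def count_tool_calls(trajectory):
    """Count how many turns in a trajectory are tool invocations."""
    return sum(1 for turn in trajectory["turns"] if turn["role"] == "tool")

print(count_tool_calls(example_trajectory))  # prints 1
```

The point of trajectory-level records like this is that every intermediate tool call is inspectable, not just the final answer.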

5. 🧪 Test Your AI Helper

You run your AI agent through these multi-turn scenarios to see if it spots dangers and stays on the safe path.
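In practice, this step reduces to replaying each scenario and comparing your agent's safety verdict against the ground-truth label. A minimal sketch, where the record fields and the `my_agent_is_safe` judge are both hypothetical stand-ins for your own agent:

```python
def my_agent_is_safe(trajectory):
    """Hypothetical stand-in for your agent's safety judgment.
    Here: flag any turn that invokes a tool we consider destructive."""
    risky_tools = {"delete_files", "transfer_funds"}
    return all(turn.get("name") not in risky_tools
               for turn in trajectory["turns"] if turn["role"] == "tool")

def evaluate(trajectories):
    """Fraction of trajectories where the verdict matches the label."""
    correct = 0
    for traj in trajectories:
        predicted = "safe" if my_agent_is_safe(traj) else "unsafe"
        correct += predicted == traj["label"]
    return correct / len(trajectories)

demo = [
    {"label": "unsafe",
     "turns": [{"role": "tool", "name": "transfer_funds"}]},
    {"label": "safe",
     "turns": [{"role": "tool", "name": "flight_search"}]},
]
print(evaluate(demo))  # prints 1.0
```

A real run would replace the keyword heuristic with your agent or guardrail model scoring each trajectory.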

6. 🔍 Review Safety Details

You examine results for risks, failure spots, and potential real harms to understand what went wrong or right.
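Because the benchmark ships a diagnosis taxonomy (risk sources, failure modes, real-world harms), the review step is more useful as a per-category tally than a single pass rate. A sketch, with the taxonomy field names assumed for illustration:

```python
from collections import Counter

def diagnose(failed_trajectories):
    """Tally failures by (hypothetical) taxonomy fields so you can see
    where an agent breaks, not just how often it breaks."""
    by_risk = Counter(t["risk_source"] for t in failed_trajectories)
    by_mode = Counter(t["failure_mode"] for t in failed_trajectories)
    return by_risk, by_mode

failures = [
    {"risk_source": "ambiguous_instruction", "failure_mode": "over_execution"},
    {"risk_source": "malicious_tool_output", "failure_mode": "blind_trust"},
    {"risk_source": "ambiguous_instruction", "failure_mode": "blind_trust"},
]
by_risk, by_mode = diagnose(failures)
print(by_risk.most_common(1))  # prints [('ambiguous_instruction', 2)]
```

Slicing failures this way tells you whether to fix instruction handling, tool-output skepticism, or something else entirely.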

Build Safer AI

With clear insights from the tests, you improve your AI helper to handle tools and long tasks more safely.


AI-Generated Review

What is ATBench?

ATBench delivers a diverse, realistic benchmark of agent trajectories for safety evaluation and diagnosis of long-horizon, tool-using AI agents. You get two ready-to-load Hugging Face datasets: 1,000 trajectories in the main ATBench release (balanced safe/unsafe) and a 500-trajectory legacy version, each with full execution traces of user requests, tool calls, and feedback. Load them in Python via the datasets library to test whether your agent stays safe, or to diagnose failures across risk sources, failure modes, and real-world harms using a structured taxonomy.
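Loading is a one-liner with the `datasets` library; note the Hub repo ID below is a hypothetical guess assembled from the GitHub owner and repo name, so verify it on the project page before use:

```python
def load_atbench(split="train"):
    """Load ATBench trajectories from the Hugging Face Hub.
    The repo ID is a guess ("LiYu0524/ATBench") -- confirm the real ID
    on the project page. Requires `pip install datasets` and network."""
    from datasets import load_dataset  # imported lazily on first call
    return load_dataset("LiYu0524/ATBench", split=split)

def label_balance(records):
    """Fraction of records labeled 'safe' (field name assumed)."""
    labels = [r["label"] for r in records]
    return labels.count("safe") / len(labels)

# Offline demo of the balance check on stand-in records:
print(label_balance([{"label": "safe"}, {"label": "unsafe"}]))  # prints 0.5
```

With the real data, `label_balance(load_atbench())` would let you confirm the advertised safe/unsafe balance before running evals.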

Why is it gaining traction?

It stands out for trajectory-level granularity on realistic, multi-turn interactions spanning thousands of tools, making it tougher than prior agent-safety benchmarks: even top models drop in performance on it. The built-in diagnosis taxonomy lets you pinpoint issues beyond binary safe/unsafe labels, and the balanced, human-audited splits make evaluation more reliable. Developers are drawn to the zero-setup Hugging Face access and the paper-backed generation pipeline behind its diverse, production-like scenarios.

Who should use this?

AI safety researchers benchmarking guardrails for tool-augmented agents like those in AgentDoG workflows. Teams building diagnostic frameworks for long-context agent failures in enterprise tools. Eval engineers at labs stress-testing models on realistic trajectories before deployment.

Verdict

Grab the HF datasets now if agent safety evals are your focus: solid for quick trajectory benchmarking despite the thin repo (just docs, with the engine incoming). With 16 stars and 100% credibility, it's early but promising; cite the papers and watch for the tooling to mature.


