dreadnode/agent-lens

Agent observability and replay tooling for AI safety & interpretability research.

17 stars · 1 fork · 100% credibility · Python
AI Summary

AgentLens is a research tool for running and analyzing multi-session interactions with AI coding agents. It captures detailed conversation logs and file changes, and it enables behavioral experiments such as resampling.

How It Works

1. 📖 Discover AgentLens

You hear about AgentLens, a helpful tool for watching how AI coding helpers behave over multiple chats and file edits.

2. 🛠️ Set it up quickly

Clone and install it on your computer with standard Python tooling, no hassle.

3. 🔗 Link your AI helper

Connect an AI provider such as Anthropic so the agent can think and act like a real coding partner.

4. 🎯 Plan and launch sessions

Describe simple tasks for the agent in a short YAML config, then watch it explore code, take notes, and make changes across chats.

5. 👀 Browse chat logs and changes

Open the web viewer to see full conversations, thinking steps, tool uses, and exactly what files changed when; a sketch of reading such a log programmatically follows at the end of this section.

6. 🔄 Test variations

Replay moments, tweak inputs, or rerun chats to spot how the agent behaves differently each time.

Unlock agent secrets

You gain clear insights into how AI agents think, adapt, and edit code over time, perfect for your research.
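
The page doesn't show AgentLens's actual log format, but to make step 5 concrete, here is a minimal Python sketch of browsing a captured session log, assuming a hypothetical trajectory.json whose turns carry text, tool calls, and file diffs (every field name below is an illustrative assumption, not the tool's documented schema):

```python
import json
from pathlib import Path

# Hypothetical log path and schema -- illustrative assumptions only,
# not AgentLens's documented output format.
log_path = Path("runs/session-001/trajectory.json")
trajectory = json.loads(log_path.read_text())

for i, turn in enumerate(trajectory.get("turns", []), start=1):
    role = turn.get("role", "?")
    text = (turn.get("text") or "")[:80]  # first 80 chars of the turn
    tools = [t.get("name") for t in turn.get("tool_calls", [])]
    files = [d.get("path") for d in turn.get("file_diffs", [])]
    print(f"turn {i:>3} [{role}] {text!r}")
    if tools:
        print(f"          tools: {', '.join(tools)}")
    if files:
        print(f"          files touched: {', '.join(files)}")
```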

AI-Generated Review

What is agent-lens?

AgentLens is a Python harness for running multi-session Claude agent workflows on codebases or directories, capturing trajectories in the ATIF standard format for agent observability. It chains, forks, or isolates sessions via simple YAML configs, tracks file changes unobtrusively with shadow git diffs, and outputs structured JSON plus a Svelte web UI for replaying turns or resampling API calls to probe behavioral variance. Developers get CLI tools like `harness run`, `resample`, and `replay` for agent observability and evaluation without manual logging.
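
As a rough illustration of the config-driven workflow, here is a hedged sketch that writes a chained/forked session plan with PyYAML. Every key below is an assumption invented for this example; the review only states that sessions are chained, forked, or isolated via simple YAML configs and that `harness run` is an entry point:

```python
import yaml  # pip install pyyaml

# Illustrative session plan -- all keys are made-up assumptions,
# not AgentLens's documented schema.
config = {
    "workdir": "path/to/codebase",
    "model": "claude-sonnet-4",  # placeholder model id
    "sessions": [
        {"name": "explore",
         "task": "Map the module layout and take notes."},
        {"name": "refactor",  # hypothetical chained session
         "task": "Extract the parser into its own module.",
         "mode": "chain"},
        {"name": "variant",   # hypothetical fork off 'explore'
         "task": "Extract the parser into its own module.",
         "mode": "fork", "from": "explore"},
    ],
}

with open("run.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# The documented entry point is the `harness run` CLI; whether it takes
# a config path as its argument is an assumption here:
#   harness run run.yaml
```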

Why is it gaining traction?

It stands out as an open-source agent observability tool tailored for AI safety research, with built-in resampling for intervention testing: edit prompts, tool results, or system messages and rerun turns to spot patterns like hedging. The web UI shines for inspection, offering side-by-side resamples, memory diffs, subagent views, and changelogs, all from Claude sessions routed via Anthropic, OpenRouter, Bedrock, or Vertex. Few repos handle multi-session forking and turn-level replay this cleanly for interpretability work.
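
To see what resampling buys you, here is a minimal standalone sketch (not AgentLens code) that replays one user turn several times through the Anthropic Python SDK and counts a simple hedging marker; the model id is a placeholder, and AgentLens automates this kind of loop with full trajectory capture and a UI on top:

```python
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder; substitute any available model

# The turn to probe: same prefix, sampled several times.
messages = [{"role": "user",
             "content": "Should I force-push to main to fix this history?"}]

HEDGES = ("it depends", "generally", "might", "consider")

samples = []
for _ in range(5):
    resp = client.messages.create(model=MODEL, max_tokens=300,
                                  messages=messages)
    samples.append(resp.content[0].text)

hedged = sum(any(h in s.lower() for h in HEDGES) for s in samples)
print(f"{hedged}/{len(samples)} resamples contained a hedging marker")
```

Diffing repeated samples of the same turn is the core of the intervention testing described above; editing the prompt or a tool result and resampling again shows whether a behavior is stable or an artifact of one rollout.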

Who should use this?

AI alignment researchers testing behavioral hypotheses across chained sessions, or interpretability folks studying variance across subagents. It's also a fit for developers prototyping agent workflows who need precise trajectory capture and edit-resample loops, especially on Claude models.

Verdict

Grab it if you're deep in Claude agent observability work: the docs are sharp, the CLI intuitive, and the ATIF integration solid. At 17 stars it's experimental (turn replay is flagged as beta), so expect bugs and consider contributing PRs to help it mature. A strong start for niche agent-observability tooling.
