
AI45Lab / DeepScan


Diagnostic Framework for LLMs and MLLMs

31 stars · 100% credibility · Found Feb 10, 2026 at 23 stars · Python
AI Summary

DeepScan is an open-source framework for diagnosing large language models and multimodal models using specialized evaluators to analyze representations, safety boundaries, and reasoning processes.

How It Works

1. 🔍 Discover DeepScan

You hear about DeepScan, a tool for checking how AI models think and for spotting safety issues.

2. 📦 Get it ready

Download and set up DeepScan on your machine in a few minutes.

3. 🧠 Pick your AI

Choose a model such as Qwen or Llama, plus some example conversations to test.

4. 📋 Plan your check

Select a test type, such as TELLME for clear thinking or X-Boundary for safety boundaries.

5. 🚀 Run the diagnosis

Hit go and watch DeepScan analyze your model's inner workings automatically.

6. 📊 Review results

Get easy-to-read reports, charts, and scores showing strengths and weak spots.

Understand your AI

Now you know how safe and capable your model is, and you're ready to improve it or share your findings.
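
Under the hood, the "plan your check" and "run the diagnosis" steps presumably reduce to writing a small YAML config and pointing the CLI at it. DeepScan ships an example file, examples/config.tellme.yaml; the key names in this sketch are guesses at the general shape, not DeepScan's real schema:

```yaml
# Hypothetical run config in the spirit of examples/config.tellme.yaml.
# Every key name here is an assumption, not DeepScan's documented schema.
model:
  name: Qwen2.5-7B-Instruct        # one of the pre-registered model families
dataset:
  path: data/example_conversations.jsonl
evaluator:
  type: tellme                     # or x-boundary for safety-boundary checks
output:
  dir: results/tellme_run
```

A run is then a single CLI call of the form `python -m deepscan.run --config <your-config>.yaml`, which is the entry point the project documents.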

AI-Generated Review

What is DeepScan?

DeepScan is a Python diagnostic framework for probing large language models and multimodal LLMs, much as a car diagnostic tool scans an engine for hidden faults. It lets you register models like Qwen or Llama, load datasets via YAML configs, run specialized evaluators for issues like concept disentanglement or safety boundaries, and summarize the results, all through a simple CLI or Python API. Developers get quick insights into model internals without building pipelines from scratch.
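
The register-then-run flow described above follows a common registry pattern. The sketch below illustrates that pattern in plain Python; everything here (register_evaluator, diagnose, the toy scoring logic) is illustrative, not DeepScan's actual API:

```python
# Minimal sketch of a registry-driven evaluator runner, illustrating the
# register/lookup/run flow the summary describes. All names are hypothetical,
# not DeepScan's real API.

EVALUATORS = {}

def register_evaluator(name):
    """Decorator that files an evaluator class in the registry under a name."""
    def wrap(cls):
        EVALUATORS[name] = cls
        return cls
    return wrap

@register_evaluator("tellme")
class ToyTellMeEvaluator:
    """Toy stand-in: scores the fraction of conversations with a reply."""
    def run(self, conversations):
        replied = sum(1 for c in conversations if c.get("reply"))
        return {"evaluator": "tellme", "score": replied / len(conversations)}

def diagnose(evaluator_name, conversations):
    """Look the evaluator up by its config key and run it."""
    evaluator = EVALUATORS[evaluator_name]()
    return evaluator.run(conversations)

report = diagnose("tellme", [{"reply": "ok"}, {"reply": ""}, {"reply": "sure"}])
print(report["score"])  # 2 of 3 conversations had a non-empty reply
```

A config key like `evaluator.type: tellme` would map straight onto the registry lookup, which is what makes the system extensible: adding a diagnostic is just registering a new class.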

Why is it gaining traction?

It stands out with pre-registered support for major families (Qwen, Llama, Mistral, Gemma), plug-and-play evaluators like TELLME for representation gaps or X-Boundary for safe/harmful geometries, and no-code CLI runs like `python -m deepscan.run --config examples/config.tellme.yaml`. The extensible registry system and automatic leaderboards make it a drop-in replacement for ad-hoc scripts, saving hours of setup while still allowing custom diagnostics.

Who should use this?

AI safety researchers benchmarking refusal behaviors, LLM fine-tuners debugging entanglement in reasoning paths, and eval teams at startups assessing models before deployment. Ideal for anyone tired of stitching together one-off Transformers scripts to probe edge cases and safety boundaries.

Verdict

Worth trying for structured LLM diagnosis if you're in AI safety: solid docs, a CLI, and worked examples make it accessible, despite 23 stars and a 1.0% credibility score signaling early alpha status. Pair it with its DeepSafe sister project for full pipelines, but expect to contribute as it matures.


