project-89

Paper repo for “Coherence-Guided Dead-Head Identification in Frozen Transformers,” including manuscript sources, figures, frozen result artifacts, and verification scripts.

Found Apr 08, 2026 at 28 stars
AI Analysis (Python)

AI Summary

This project offers a standalone scanner and analysis tools to detect inactive attention heads in transformer models using a universal physics-derived threshold, validated on models from GPT-2 to Llama with high precision.

How It Works

1
🔍 Discover the tool

You hear about a smart way to spot inactive parts in AI language models without any guesswork or tuning.

2
📦 Get ready

You gather the common building blocks needed to run AI model checkups on your computer.

3
🚀 Scan your model

Pick an AI model like GPT-2, launch the checker, and watch it analyze attention connections automatically.

4
📊 See the map

A clear layer-by-layer view appears, showing which parts are dead, alive, or protected.

5
📄 Get your report

Choose a self-contained HTML summary with embedded visuals, or a data export with plots and details for deeper analysis.

Unlock efficiencies

Now you know exactly which model parts aren't pulling their weight, ready to slim down your AI safely.
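The steps above can be sketched in a few lines. The only concrete piece of the method quoted anywhere on this page is the threshold formula tau = 0.96 / sqrt(d_model), so the per-head scores below are a random stand-in, not the repo's actual coupling metric:

```python
import numpy as np

def dead_head_map(head_scores, d_model):
    """Classify heads as dead/alive against the universal threshold
    tau = 0.96 / sqrt(d_model) described in the review.
    `head_scores` is a (layers, heads) array of per-head coupling
    scores; how those scores are computed is the paper's contribution
    and is not reproduced here."""
    tau = 0.96 / np.sqrt(d_model)
    return head_scores < tau, tau  # True means "dead"

# Toy run with GPT-2 small geometry: 12 layers x 12 heads, d_model = 768.
rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 0.2, size=(12, 12))
dead, tau = dead_head_map(scores, d_model=768)
print(f"tau = {tau:.4f}, dead heads: {int(dead.sum())}/{dead.size}")
```

The layer-by-layer "anatomy map" from step 4 would then just be this boolean grid rendered per layer.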


AI-Generated Review

What is coherence-guided-dead-head-identification?

This paper repo delivers a physics-derived, zero-parameter threshold to spot dead attention heads in frozen transformers—no model-specific tuning required. Plug in your Hugging Face causal LM via a Python scanner script, and it outputs per-head coupling scores, dead/alive classifications, layer-by-layer anatomy maps, and self-contained HTML reports with embedded visualizations. It solves the pain of calibrating pruning thresholds across architectures, validated at 95-100% precision on GPT-2, Llama, Qwen, and Gemma families.
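The review mentions JSON exports of per-head coupling scores and dead/alive classifications but never shows the schema, so the record below is purely illustrative: the field names are invented, and only the threshold formula comes from the review.

```python
import json

# Hypothetical per-head report record; keys are illustrative,
# not taken from the repo's real JSON export.
report = {
    "model": "gpt2",
    "d_model": 768,
    "tau": 0.96 / 768 ** 0.5,  # universal threshold quoted in the review
    "heads": [
        {"layer": 0, "head": 3, "coupling": 0.012, "status": "dead"},
        {"layer": 0, "head": 7, "coupling": 0.081, "status": "alive"},
    ],
}
# A head counts as "dead" when its coupling score falls below tau.
print(json.dumps(report, indent=2))
```

Whatever the actual schema looks like, the classification rule reduces to a single comparison per head against tau.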

Why is it gaining traction?

Unlike magnitude- or activation-based pruning, which needs per-model fitting, this repo uses a universal formula from coupled-oscillator criticality: tau = 0.96 / sqrt(d_model). Developers like the instant CLI scans (e.g., --model gpt2 --report anatomy.html) that yield JSON exports and plots, plus the frozen result artifacts for reproducibility. The repository bundles the manuscript, figures, and verification scripts, making it a drop-in paper repo for transformer diagnostics.
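Assuming standard hidden sizes for these checkpoints (the d_model values are common knowledge, not taken from the repo), the zero-parameter threshold tightens as models get wider:

```python
from math import sqrt

# tau = 0.96 / sqrt(d_model): wider models get a smaller dead-head cutoff.
for name, d_model in [("gpt2", 768), ("gpt2-xl", 1600), ("Llama-3-8B", 4096)]:
    tau = 0.96 / sqrt(d_model)
    print(f"{name:11s} d_model={d_model:5d} tau={tau:.5f}")
```

This is what "no model-specific tuning" means in practice: the cutoff is a pure function of model width.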

Who should use this?

ML engineers pruning heads for faster inference on deployed models like Llama or Qwen. Researchers auditing attention redundancy in frozen checkpoints. Optimization teams chasing KV-cache compaction in GQA setups without ablation runs.

Verdict

Solid paper repo for dead-head ID—grab the scanner if you're pruning transformers today. At 19 stars and 1.0% credibility, it's early but reproducible with strong docs and no-fuss verification; pair with your pruning pipeline for quick wins.
