jelllott / speechkv-trim
Speech-aware KV cache pruning for long-form speech LLMs (Qwen2-Audio, SALMONN). Token/head/chunk-level pruners + eval on LibriSpeech-long & GigaSpeech.
Hush KV is a research tool that helps speech AI systems handle long audio recordings more efficiently. When processing audio longer than about 30 seconds, these systems normally run out of memory. Hush KV addresses this by deciding which parts of the model's internal memory (its key-value cache) to keep and which to discard, much as you might take notes instead of remembering every word. The tool offers several strategies: some keep the most recent information, others score each piece by how important it seems, and one uses a trained helper to distinguish actual speech from silence or filler words. Users can connect different speech models (Qwen2-Audio, SALMONN, Whisper) and test different trimming approaches to find what works best for their needs. The project includes evaluation tools to measure whether trimming hurts accuracy on tasks such as transcription and spoken question answering.
How It Works
You need to transcribe or analyze a recording that's several minutes long, but most speech AI tools struggle with anything beyond 30 seconds.
You download and set up Hush KV, a tool that helps speech AI work efficiently with long recordings by intelligently trimming unnecessary data.
You choose from available speech models like Qwen2-Audio, SALMONN, or Whisper depending on whether you need transcription only or full conversation understanding.
You pick one of three trimming strategies:
Keep recent tokens plus important anchor points such as sentence starts and silence boundaries.
Let the model score each piece by how much attention it receives, and keep the highest-scoring ones.
Use a trained helper that identifies which parts of the audio contain actual speech rather than silence or filler words.
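The three strategies above can be sketched as functions that choose which cache positions to keep. This is a minimal illustration, not Hush KV's actual code: the function names, the `window` and `budget` parameters, and the precomputed attention scores and speech flags are all assumptions made for the example.

```python
from typing import List, Sequence, Set


def recency_anchor_keep(seq_len: int, window: int, anchors: Sequence[int]) -> List[int]:
    """Keep the last `window` positions plus any anchor positions.

    `anchors` stands in for sentence starts or silence boundaries
    (hypothetical input; how they are detected is not shown here).
    """
    recent_start = max(0, seq_len - window)
    keep: Set[int] = set(range(recent_start, seq_len))
    # Ignore anchors that fall outside the current sequence.
    keep.update(a for a in anchors if 0 <= a < seq_len)
    return sorted(keep)


def attention_topk_keep(scores: Sequence[float], budget: int) -> List[int]:
    """Keep the `budget` positions with the highest accumulated attention.

    `scores[i]` is assumed to be the total attention position i has received.
    """
    budget = max(0, min(budget, len(scores)))
    # Rank by score, highest first; ties go to the earlier position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:budget])


def speech_mask_keep(is_speech: Sequence[bool]) -> List[int]:
    """Keep positions that a (hypothetical) trained classifier flagged as speech."""
    return [i for i, flag in enumerate(is_speech) if flag]


if __name__ == "__main__":
    print(recency_anchor_keep(seq_len=10, window=3, anchors=[0, 4]))
    print(attention_topk_keep([0.1, 0.9, 0.3, 0.9, 0.05], budget=2))
    print(speech_mask_keep([True, False, True]))
```

Each function returns sorted indices, so the kept entries stay in their original temporal order when the cache is rebuilt from them.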
With one command, you process your long recording and watch as the tool intelligently manages the AI's memory while preserving accuracy.
The AI produces accurate transcription or answers while using far less memory than it would have needed without trimming.
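To give a rough sense of where the memory saving comes from, the sketch below estimates key-value cache size for a standard transformer decoder. The layer count, head count, head dimension, sequence length, and 25% keep ratio are illustrative assumptions, not the configuration of Qwen2-Audio, SALMONN, or Whisper, and not measured results from this project.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size in bytes for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 corresponds to 16-bit floats.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    layers, kv_heads, head_dim = 32, 32, 128
    full_len = 6000                # assumed number of cached positions
    kept_len = full_len // 4       # assumed 25% keep ratio after trimming

    full = kv_cache_bytes(layers, kv_heads, head_dim, full_len)
    kept = kv_cache_bytes(layers, kv_heads, head_dim, kept_len)
    print(f"full cache:    {full / 2**30:.2f} GiB")
    print(f"trimmed cache: {kept / 2**30:.2f} GiB")
```

Because the cache grows linearly with the number of stored positions, discarding a given fraction of positions reduces cache memory by the same fraction under these assumptions.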