andraiming

A side-by-side benchmarking playground for discrete speech tokenizers (EnCodec, HuBERT-units, SpeechTokenizer, etc.).

45
0
89% credibility
Found May 26, 2026 at 45 stars.
AI Analysis
Python
AI Summary

Speech Tokenizer Arena is a benchmarking tool that compares different speech compression technologies by running the same audio through multiple methods and measuring quality, efficiency, and speech recognition accuracy to help researchers choose the best option for their needs.

How It Works

1
🎯 You want to compare speech compression methods

You've collected audio files and need to find the best way to compress them while keeping quality high.

2
📦 You install the comparison tool

You download Speech Tokenizer Arena, a tool that tests different compression methods side-by-side on your audio.

3
🎛️ You pick which methods to test

You choose from options like EnCodec, DAC, SpeechTokenizer, or HuBERT units based on what you want to compare.

4
You run the comparison and watch it work

The tool runs your audio through each compression method, measures quality, and tracks how much data each one uses.

5
You choose how to view your results
📄
Read the summary table

See all methods compared in one table, with scores for quality, speed, and data usage

📈
Look at detailed charts

View bar graphs and spectrograms showing exactly how each method performed on your audio

🏆 You know which method is best for you

With clear numbers in hand, you can confidently choose the compression method that fits your quality needs and data budget.
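
To make the "data budget" concrete, here is a minimal sketch of how a nominal bitrate is commonly derived for a codebook-based tokenizer: frames per second, times codebooks per frame, times bits per code. The function name `bitrate_bps` and the example settings are assumptions for illustration; how Speech Tokenizer Arena itself computes bitrate is not shown in this listing.

```python
import math


def bitrate_bps(frame_rate_hz: float, num_codebooks: int, codebook_size: int) -> float:
    """Nominal bitrate in bits per second (hypothetical helper, not the tool's API)."""
    # Each code picks one entry out of `codebook_size`, so it costs log2(size) bits.
    bits_per_code = math.log2(codebook_size)
    return frame_rate_hz * num_codebooks * bits_per_code


# Illustrative settings only: 75 frames/s, 8 codebooks, 1024 entries per codebook.
print(bitrate_bps(75, 8, 1024))  # 6000.0 bits/s, i.e. 6 kbps
```

Halving the number of codebooks halves the bitrate, which is why two tokenizers are only comparable when their bitrates are reported alongside their quality scores.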


AI-Generated Review

What is speech-tokenizer-arena?

Speech Tokenizer Arena is a benchmarking harness for discrete speech tokenizers like EnCodec, DAC, SpeechTokenizer, and HuBERT units. Drop in a YAML config, point it at your audio data, and get a side-by-side markdown leaderboard comparing reconstruction quality, bitrate, and downstream ASR performance. Built in Python on PyTorch, it gives you mel-SD, SI-SDR, STOI, PESQ, and Whisper-based WER metrics out of the box. The CLI command `sta run --config configs/all.yaml` kicks off a full evaluation sweep with one line.
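
Of the metrics listed, SI-SDR is compact enough to sketch directly. The version below follows the standard textbook definition in pure Python and is not taken from the project's code; the name `si_sdr_db` and the `eps` guard are assumptions for illustration.

```python
import math
from typing import Sequence


def si_sdr_db(reference: Sequence[float], estimate: Sequence[float], eps: float = 1e-12) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (illustrative sketch)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    n = len(reference)
    # The usual definition assumes zero-mean signals, so remove the mean first.
    ref_mean = sum(reference) / n
    est_mean = sum(estimate) / n
    ref = [x - ref_mean for x in reference]
    est = [x - est_mean for x in estimate]
    # Project the estimate onto the reference to get the scaled target.
    dot = sum(r * e for r, e in zip(ref, est))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]
    # Whatever is left after removing the target counts as distortion.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10 * math.log10((target_energy + eps) / (noise_energy + eps))
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the reconstructed audio leaves the score unchanged, which matters when codecs differ in output gain.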

Why is it gaining traction?

The speech tokenizer space is fragmented: every paper uses different datasets, bitrates, and metrics, which makes honest comparison nearly impossible. This project forces tokenizers through the same gauntlet on identical audio, producing comparable numbers you can drop into a paper appendix. The YAML-driven config system means adding new tokenizers or swapping datasets requires zero code changes. It handles the boring parts (resampling, metric aggregation, report generation) so you can focus on interpreting results rather than wiring up experiments.
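
As a rough idea of what metric aggregation and report generation involve, the sketch below averages per-file scores and renders a markdown table. It is an assumption about the general shape of such a step, not the project's implementation; `leaderboard_markdown`, the input layout, and the tokenizer names are hypothetical, and the numbers are placeholders rather than real results.

```python
from statistics import mean


def leaderboard_markdown(results: dict[str, list[dict[str, float]]], metrics: list[str]) -> str:
    """Average per-file metrics per tokenizer and render a markdown table (sketch)."""
    header = "| Tokenizer | " + " | ".join(metrics) + " |"
    divider = "|---" * (len(metrics) + 1) + "|"
    rows = []
    for name, per_file in results.items():
        cells = []
        for metric in metrics:
            # A tokenizer may lack a metric entirely; show "n/a" instead of failing.
            values = [scores[metric] for scores in per_file if metric in scores]
            cells.append(f"{mean(values):.2f}" if values else "n/a")
        rows.append(f"| {name} | " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])


# Placeholder scores for two made-up tokenizers, two files each at most.
placeholder = {
    "tokenizer_a": [{"si_sdr": 1.0, "wer": 0.5}, {"si_sdr": 3.0, "wer": 0.3}],
    "tokenizer_b": [{"wer": 0.5}],
}
print(leaderboard_markdown(placeholder, ["si_sdr", "wer"]))
```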

Who should use this?

Researchers comparing speech tokenizers for downstream tasks will find the most value here. If you're building a language model that uses discrete speech codes, use this to pick the right tokenizer for your bitrate budget. Audio compression researchers can use it as a standardized evaluation harness. Developers integrating speech tokenization into products should run their candidates through the arena before committing.

Verdict

This is a useful, focused tool that fills a real gap in the speech ML ecosystem. The 89% credibility score reflects a small but active codebase with clear documentation and MIT licensing. At 45 stars, it's early-stage but production-ready for benchmarking workflows. The main caveat: HuBERT units are encoder-only, so decode-side metrics are unavailable for that tokenizer until someone adds a unit vocoder. If you're evaluating speech tokenizers today, run your candidates here before trusting a paper's self-reported numbers.
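
The encoder-only caveat can be pictured with a small sketch in which the decoder is optional and reconstruction metrics are skipped when it is missing. This is one plausible way a harness could model the situation, not the project's actual design; `TokenizerEntry`, `evaluate`, and the toy encode/decode functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Audio = Sequence[float]
Codes = Sequence[int]


@dataclass
class TokenizerEntry:
    """Hypothetical registry entry; `decode` stays None for encoder-only tokenizers."""
    name: str
    encode: Callable[[Audio], Codes]
    decode: Optional[Callable[[Codes], Audio]] = None


def evaluate(entry: TokenizerEntry, audio: Audio, metric: Callable[[Audio, Audio], float]) -> dict:
    codes = entry.encode(audio)
    result = {"name": entry.name, "num_codes": len(codes)}
    if entry.decode is None:
        # No waveform can be reconstructed, so decode-side metrics are unavailable.
        result["reconstruction"] = None
    else:
        result["reconstruction"] = metric(audio, entry.decode(codes))
    return result


def mean_abs_error(ref: Audio, est: Audio) -> float:
    """Toy stand-in for a real reconstruction metric."""
    return sum(abs(r - e) for r, e in zip(ref, est)) / len(ref)


# Toy tokenizers: rounding stands in for real quantization.
full = TokenizerEntry("toy_codec", lambda a: [round(x) for x in a], lambda c: [float(x) for x in c])
encoder_only = TokenizerEntry("toy_units", lambda a: [round(x) for x in a])
print(evaluate(encoder_only, [0.2, 1.1], mean_abs_error)["reconstruction"])  # None
```

Encode-side measurements such as code counts remain available for the encoder-only entry, while the reconstruction score is left empty instead of being reported as zero.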
