FoxNoseTech / diarize

Public

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

16
0
100% credibility
Found Mar 03, 2026 at 16 stars.
AI Analysis
Python
AI Summary

A lightweight Python tool that analyzes audio files to automatically detect and label different speakers and their speaking times.

How It Works

1
🔍 Find the tool

You search online for a simple way to figure out who spoke when in your audio recordings, like meetings or calls.

2
💻 Set it up

You add this handy audio analyzer to your computer with one easy download step—no extra accounts or fancy hardware needed.

3
🎤 Pick your audio

You select a recording file from your device, like a team meeting or interview.

4
✨ Run the analysis

You start the tool on your file, and it quickly listens through the audio to spot different speakers all on its own.

5
📋 See the timeline

You get a clear list showing each part of the audio and which speaker was talking during those moments.

6
💾 Save your results

You export the speaker info to a simple file to review, share, or use in notes.

🎉 Know who said what

Now you have a speaker-labeled breakdown of your recording, making it easier to follow conversations without guessing.
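As a rough illustration of what the timeline from the steps above can be used for, the sketch below totals speaking time per speaker. The `(start, end, speaker)` tuple layout and the sample values are assumptions made for this example; they are not the library's documented result type.

```python
from collections import defaultdict

# Hypothetical timeline: (start_seconds, end_seconds, speaker_label).
# This layout is an assumption for illustration, not diarize's actual output type.
segments = [
    (0.0, 4.5, "SPEAKER_00"),
    (4.5, 9.0, "SPEAKER_01"),
    (9.0, 12.0, "SPEAKER_00"),
]


def talk_time(segments):
    """Sum the speaking time in seconds for each speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in segments:
        totals[speaker] += end - start
    return dict(totals)


totals = talk_time(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f} s")
```

A summary like this is often the first thing people want from a meeting or interview timeline, and it only needs the start time, end time, and label of each segment.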

AI-Generated Review

What is diarize?

Diarize is a Python library that answers "who spoke when?" for an audio file, segmenting speech by speaker without GPUs, API keys, or accounts. Install it with pip, call diarize("meeting.wav"), and get timed segments labeled SPEAKER_00, SPEAKER_01, and so on, with automatic speaker-count detection and RTTM export. It handles WAV, MP3, FLAC, and more, and processes audio about 8x faster than real time on CPU.
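To make the RTTM export concrete, here is a minimal sketch of the standard RTTM line layout (ten space-separated fields per speaker turn). It does not use the library's own exporter: the `to_rttm` helper, the segment tuples, and the `"meeting"` file ID are hypothetical names introduced for this example.

```python
def to_rttm(segments, file_id, channel=1):
    """Format (start, end, speaker) tuples as RTTM SPEAKER lines.

    Hypothetical helper for illustration; not part of the diarize API.
    RTTM stores turn onset and turn duration (not the end time) in seconds,
    and unused fields are written as <NA>.
    """
    lines = []
    for start, end, speaker in segments:
        duration = end - start
        lines.append(
            f"SPEAKER {file_id} {channel} {start:.3f} {duration:.3f} "
            f"<NA> <NA> {speaker} <NA> <NA>"
        )
    return "\n".join(lines)


# Assumed example segments, in seconds.
example = [(0.0, 4.5, "SPEAKER_00"), (4.5, 9.0, "SPEAKER_01")]
rttm = to_rttm(example, file_id="meeting")
print(rttm)
```

Because RTTM is plain text, output in this shape can be read by common diarization scoring tools or parsed with a simple `split()`.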

Why is it gaining traction?

Unlike pyannote's free models, which require Hugging Face tokens and run 7x slower on CPU, this diarizer reports ~10.8% DER on VoxConverse, edging out the speaker-diarization-3.1 and community tiers, while staying fully local and Apache 2.0 licensed. Developers pick it for zero-setup speaker diarization in scripts, with simple API options like num_speakers=2. Published benchmarks and docs make it easier to trust than black-box alternatives.
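For readers unfamiliar with the metric, DER (diarization error rate) is the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below shows the arithmetic. The error durations are invented purely so the result lands at 10.8%; they are not the project's actual benchmark breakdown.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in seconds.
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (missed + false_alarm + confusion) / total_reference


# Made-up durations for illustration only (not measured results).
der = diarization_error_rate(
    missed=30.0,
    false_alarm=24.0,
    confusion=54.0,
    total_reference=1000.0,
)
print(f"DER: {der:.1%}")
```

Because false alarms count in the numerator but not the denominator, DER can exceed 100% on very noisy output, which is worth keeping in mind when comparing scores across datasets.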

Who should use this?

Backend developers building meeting transcription pipelines or call analyzers get a drop-in diarizer for batch audio without cloud costs. Podcast processors and voice app makers handling multi-speaker files, such as smart-speaker prototypes or speaker detection tools, will like the CPU speed for local runs. Researchers adapting diarized output for custom datasets fit too.

Verdict

Grab it if you want pyannote-like results without the token dance—solid for CPU-bound speaker recognition workflows—but with 16 stars and 1.0% credibility, it's beta-stage; validate on your audio beyond VoxConverse. Docs and coverage shine, so prototype fast.
