Gr122lyBr / voicetag

Public

Speaker identification powered by pyannote and resemblyzer

15
1
100% credibility
Found Mar 17, 2026 at 15 stars.
AI Analysis
Python
AI Summary

voicetag is a Python library for automatically identifying speakers in audio files and optionally transcribing their speech with timestamps.

How It Works

1. 🔍 Discover voicetag

You find voicetag, a tool that figures out who is speaking in audio recordings such as meetings, podcasts, or calls.

2. 📥 Set it up

You install voicetag on your computer so it's ready to use.

3. Get free access

You visit a website to agree to simple terms and grab a free pass to unlock the voice listening features.

4. 👥 Teach voices

You name people and share a few short clips of their voices so voicetag recognizes them later.

5. Pick your goal

👂 Spot speakers: Find out exactly when each person talked and how confident voicetag is in each match.

💬 Get full words: Turn speech into text, labeled by who said it.

6. ▶️ Run on recording

You select your audio file, and voicetag processes it to label the speakers.

Enjoy clear results

You get a timeline showing who spoke when, with overlaps noted, and transcripts if you chose them, making your audio easy to follow.
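The "teach voices" and "spot speakers" steps can be pictured as comparing voice embeddings against enrolled profiles. The sketch below is an illustration under stated assumptions, not voicetag's actual code: the function names `enroll` and `identify`, the 0.75 threshold, and the three-dimensional toy vectors are all hypothetical, and real voice embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several sample embeddings into one profile vector (hypothetical helper)."""
    if not samples:
        raise ValueError("at least one sample embedding is required")
    dim = len(samples[0])
    return [sum(sample[i] for sample in samples) / len(samples) for i in range(dim)]


def identify(embedding, profiles, threshold=0.75):
    """Return (name, score) of the closest profile, or ("UNKNOWN", score) below threshold.

    The threshold value is an assumption chosen for this toy example.
    """
    best_name, best_score = "UNKNOWN", -1.0
    for name, profile in profiles.items():
        score = cosine_similarity(embedding, profile)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "UNKNOWN", best_score
    return best_name, best_score


# Toy embeddings standing in for the output of a real voice encoder.
profiles = {
    "alice": enroll([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]),
    "bob": enroll([[0.0, 0.9, 0.2], [0.1, 1.0, 0.0]]),
}

known = identify([0.95, 0.05, 0.05], profiles)
stranger = identify([0.0, 0.0, 1.0], profiles)
```

Here `known` resolves to "alice", while `stranger` falls below the threshold and is reported as "UNKNOWN". The returned score plays the role of the "how sure it is" value described in step 5.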

AI-Generated Review

What is voicetag?

voicetag is a Python library for speaker identification and diarization in audio files, answering "who said what, when" in meetings, podcasts, or calls. You enroll known speakers with a few samples through a simple API or CLI, then process any recording to get timed segments, each labeled with a known name, marked as unknown, or flagged as overlapping speech. It integrates pyannote for segmentation, resemblyzer for voice matching, and optional transcription via OpenAI, Groq, Deepgram, Fireworks, or local Whisper. It is language-agnostic out of the box.
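To make "timed segments with names, unknowns, or overlaps" concrete, here is a minimal sketch of what such a result could look like. The `Segment` dataclass, its field names, and the `flag_overlaps` helper are assumptions made for illustration and are not taken from voicetag's API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One stretch of speech; field names are hypothetical."""
    start: float       # seconds from the beginning of the recording
    end: float         # seconds from the beginning of the recording
    speaker: str       # enrolled name, or "UNKNOWN"
    confidence: float  # match score in [0, 1]
    overlap: bool = False


def flag_overlaps(segments):
    """Return copies sorted by start time, with overlap=True where two
    different speakers talk at the same time."""
    ordered = sorted(segments, key=lambda s: s.start)
    result = [Segment(s.start, s.end, s.speaker, s.confidence) for s in ordered]
    for i, current in enumerate(result):
        for other in result[i + 1:]:
            if other.start >= current.end:
                break  # sorted by start, so no later segment can intersect
            if other.speaker != current.speaker:
                current.overlap = True
                other.overlap = True
    return result


timeline = flag_overlaps([
    Segment(4.0, 8.0, "bob", 0.8),
    Segment(0.0, 5.0, "alice", 0.9),
    Segment(9.0, 12.0, "UNKNOWN", 0.4),
])

for seg in timeline:
    marker = " (overlap)" if seg.overlap else ""
    print(f"{seg.start:6.1f}-{seg.end:6.1f}  {seg.speaker}{marker}")
```

In this toy timeline, alice and bob overlap between seconds 4 and 5, so both of their segments are flagged, while the unknown speaker's segment is not.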

Why is it gaining traction?

Its three-line Python API and full CLI (enroll, identify, transcribe, manage profiles) cover ground that raw pyannote or WhisperX leave to the user: named identification and easy profile persistence. Parallel GPU/CPU processing, overlap flagging, and pluggable STT providers deliver fast results without custom pipelines. Saving and loading profiles makes it reusable for ongoing speaker detection workflows.
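"Pluggable STT providers" usually means each transcription backend sits behind one shared interface, so the backend can be swapped by name. The sketch below shows that pattern in general terms. The `Transcriber` protocol, the registry functions, and the `EchoTranscriber` stub are hypothetical and do not reflect how voicetag wires up its own providers.

```python
from typing import Callable, Dict, Protocol


class Transcriber(Protocol):
    """Interface every backend is assumed to satisfy (hypothetical)."""

    def transcribe(self, audio_path: str) -> str:
        ...


class EchoTranscriber:
    """Stand-in backend that returns canned text instead of calling an API."""

    def __init__(self, text: str) -> None:
        self.text = text

    def transcribe(self, audio_path: str) -> str:
        return self.text


# Maps a provider name to a factory that builds the backend on demand.
_REGISTRY: Dict[str, Callable[[], Transcriber]] = {}


def register(name: str, factory: Callable[[], Transcriber]) -> None:
    _REGISTRY[name] = factory


def get_transcriber(name: str) -> Transcriber:
    try:
        return _REGISTRY[name]()
    except KeyError:
        available = ", ".join(sorted(_REGISTRY)) or "none"
        raise ValueError(f"unknown provider {name!r}; available: {available}") from None


register("echo", lambda: EchoTranscriber("hello from the stub backend"))

text = get_transcriber("echo").transcribe("meeting.wav")
```

A real backend would replace the stub's `transcribe` body with a call to its service or local model. The calling code stays the same because it only depends on the shared interface.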

Who should use this?

Audio ML developers prototyping smart speakers or voice bots on ESP32; podcast producers auto-labeling host and guest segments; call center analysts distinguishing agents from customers; and forensic teams verifying speakers on tapes, with transcription.

Verdict

Worth trying for quick speaker-identification setups in Python: excellent docs, typed results, and an MIT license lower the barrier, although 15 stars and a 1.0% credibility score signal early alpha maturity. Test it on your own data, and pair it with production monitoring.
