Gr122lyBr / voicetag

Public

Speaker identification powered by pyannote and resemblyzer

audio-transcription deep-learning deepgram diarization groq

100% credibility

Found Mar 17, 2026 at 15 stars

AI Analysis

Python

AI Summary

voicetag is a Python library for automatically identifying speakers in audio files and optionally transcribing their speech with timestamps.

How It Works

🔍 Discover voicetag

You find voicetag, a handy tool that figures out who is speaking in your audio recordings like meetings, podcasts, or calls.

📥 Set it up

You easily add voicetag to your computer so it's ready to use.

✅ Get free access

You visit a website to agree to simple terms and grab a free pass to unlock the voice listening features.

👥 Teach voices

You name people and share a few short clips of their voices so voicetag recognizes them later.

Pick your goal

👂 Spot speakers

Find out exactly when each person talked and how confident voicetag is in each match.

💬 Get full words

Turn speech into text, labeled by who said it.

▶️ Run on recording

You select your audio file, and voicetag quickly processes it to label the speakers.

✅ Enjoy clear results

You get a neat timeline showing who spoke when, with overlaps noted, and transcripts if you chose, making your audio easy to follow.
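The timeline described above can be pictured as a list of timed, labeled segments. The following is a minimal sketch of that idea, not voicetag's actual data model: the `Segment` class, its field names, the `find_overlaps` helper, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One labeled stretch of audio (hypothetical shape, not voicetag's own type)."""
    start: float       # seconds from the beginning of the recording
    end: float         # seconds from the beginning of the recording
    speaker: str       # enrolled name, or "UNKNOWN"
    confidence: float  # match confidence between 0.0 and 1.0


def find_overlaps(segments):
    """Return (start, end, speaker_a, speaker_b) for each pair of segments
    from different speakers that share some stretch of time."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so no later segment can overlap a
            if a.speaker != b.speaker:
                overlaps.append((b.start, min(a.end, b.end), a.speaker, b.speaker))
    return overlaps


# Made-up example timeline: Alice and Bob talk over each other briefly.
timeline = [
    Segment(0.0, 4.2, "Alice", 0.91),
    Segment(3.8, 7.5, "Bob", 0.87),
    Segment(7.5, 9.0, "UNKNOWN", 0.40),
]
overlaps = find_overlaps(timeline)
print(overlaps)
```

Segments that only touch at a boundary (one ends exactly where the next begins) are not counted as overlapping in this sketch.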


AI-Generated Review

What is voicetag?

voicetag is a Python library for speaker identification and diarization in audio files, answering "who said what, when" in meetings, podcasts, or calls. Enroll known speakers with a few samples via a simple API or CLI, then process any recording to get timed segments, each labeled with an enrolled speaker's name, marked as unknown, or flagged as overlapping speech. It uses pyannote for segmentation and resemblyzer for voice matching, with optional transcription via OpenAI, Groq, Deepgram, Fireworks, or local Whisper. It is language-agnostic out of the box.
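To make the enroll-then-match idea concrete, here is a minimal sketch of one common approach: average a few sample embeddings into a per-speaker profile, then compare a new segment's embedding against each profile by cosine similarity and fall back to "UNKNOWN" below a threshold. This is an assumption about the general technique, not voicetag's confirmed implementation. The function names, the `0.75` threshold, and the three-dimensional toy vectors are hypothetical; real voice embeddings would come from an encoder such as resemblyzer and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several sample embeddings into one profile vector."""
    count = len(samples)
    return [sum(values) / count for values in zip(*samples)]


def identify(embedding, profiles, threshold=0.75):
    """Return (name, score) for the best-matching profile, or
    ("UNKNOWN", score) when the best score is below the threshold.
    The threshold value is an illustrative assumption."""
    best_name, best_score = "UNKNOWN", -1.0
    for name, profile in profiles.items():
        score = cosine_similarity(embedding, profile)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "UNKNOWN", best_score
    return best_name, best_score


# Toy embeddings standing in for real voice-encoder output.
profiles = {
    "Alice": enroll([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]),
    "Bob": enroll([[0.0, 0.9, 0.2], [0.1, 1.0, 0.0]]),
}
name, score = identify([0.95, 0.05, 0.05], profiles)
stranger, stranger_score = identify([0.0, 0.1, 1.0], profiles)
print(name, stranger)
```

The threshold trades false matches against false unknowns, so it is worth tuning on recordings that resemble the ones you plan to process.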

Why is it gaining traction?

Its three-line Python API and full CLI (enroll, identify, transcribe, manage profiles) beat raw pyannote or WhisperX, which lack named identification or easy profile persistence. Parallel GPU/CPU processing, overlap flagging, and pluggable STT providers deliver fast results without custom pipelines. Save/load profiles make it reusable for ongoing speaker detection workflows.
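Profile persistence is what makes enrollment a one-time step. As a rough sketch of the idea, assuming profiles are plain name-to-vector mappings, they can be written to disk and read back with the standard library. The JSON layout, the file name, and the `save_profiles` and `load_profiles` helpers below are hypothetical and do not describe voicetag's actual storage format or API.

```python
import json
import tempfile
from pathlib import Path


def save_profiles(profiles, path):
    """Write a {name: embedding} mapping to a JSON file."""
    Path(path).write_text(json.dumps(profiles, indent=2), encoding="utf-8")


def load_profiles(path):
    """Read a {name: embedding} mapping back from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Toy profile vectors; real ones would come from a voice encoder.
profiles = {"Alice": [0.95, 0.05, 0.05], "Bob": [0.05, 0.95, 0.1]}

with tempfile.TemporaryDirectory() as tmp:
    profile_file = Path(tmp) / "profiles.json"
    save_profiles(profiles, profile_file)
    restored = load_profiles(profile_file)

print(restored == profiles)
```

A text format such as JSON keeps profiles easy to inspect and diff; a binary format would be more compact for large embeddings.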

Who should use this?

Audio ML developers prototyping smart speakers or voice bots on ESP32, podcast producers auto-labeling host and guest segments, call center analysts distinguishing agents from customers, and forensic teams verifying speakers on tape and transcribing what they said.

Verdict

Worth trying for quick speaker-identification setups in Python: excellent docs, typed results, and an MIT license lower the barrier, although 15 stars and a 1.0% credibility score signal early-alpha maturity. Test it on your own data, and add monitoring before relying on it in production.


Base stars: 15 stars

Penalty: Very new repo (0d): -70%

Bonus: AI verified quality (100%)

Account age: 2,184 days

Repo age: 0 days

License: MIT

Updated: Mar 17, 2026