wavtechyukky

Python reimplementation of SHIRO phoneme-to-speech alignment toolkit for Japanese singing voice

14
1
100% credibility
Found Mar 29, 2026 at 14 stars.
AI Analysis
Jupyter Notebook
AI Summary

pyshiro automatically aligns Japanese hiragana lyrics to singing voice audio recordings to generate precise phoneme boundary labels for annotation or synthesis.

How It Works

1. 🎤 Discover pyshiro

You come across a friendly tool that matches Japanese song lyrics to the actual singing voice.

2. 📁 Gather your song and lyrics

Collect a recording of your Japanese song and write the lyrics in simple hiragana, one phrase per line with pauses.

3. ✂️ Chop into short clips

Break the song into bite-sized phrases so the tool can focus easily without getting overwhelmed.

4. Magic alignment happens

Feed in the clips and lyrics, and watch as it automatically figures out exactly when each sound starts and ends.

5. ✏️ Tweak by ear

Open the colorful timeline labels in a free audio editor and nudge any timings that need a gentle fix.

6. 🔗 Stitch it all together

Combine the fixed pieces back into one smooth full-song label file, filling gaps with pauses.

🎉 Perfect timings ready!

Celebrate having spot-on phoneme boundaries for your singing voice project or synthesizer.
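
The final stitching step can be sketched in a few lines of Python. This is an illustrative sketch, not pyshiro's own code: the function names `stitch_labels` and `to_audacity`, the `(start, end, phoneme)` tuple layout, and the `pau` pause symbol are all assumptions made for the example. It assumes each clip's labels use clip-relative times in seconds, that you know each clip's offset in the full song, and that clips do not overlap. The output uses the tab-separated start/end/text layout of Audacity label tracks.

```python
from typing import List, Tuple

Label = Tuple[float, float, str]  # (start_sec, end_sec, phoneme)


def stitch_labels(clips: List[Tuple[float, List[Label]]],
                  total_duration: float,
                  pause: str = "pau",
                  eps: float = 1e-6) -> List[Label]:
    """Merge per-clip labels into one song-level list, filling gaps with pauses.

    clips: (clip_offset_sec, labels) pairs; label times are clip-relative.
    """
    # Shift every label from clip time to song time, in clip order.
    shifted: List[Label] = []
    for offset, labels in sorted(clips, key=lambda c: c[0]):
        for start, end, phoneme in labels:
            shifted.append((start + offset, end + offset, phoneme))

    stitched: List[Label] = []
    cursor = 0.0  # end time of the last label written so far
    for start, end, phoneme in shifted:
        if start - cursor > eps:
            # Unlabeled stretch between clips: insert a pause.
            stitched.append((cursor, start, pause))
        stitched.append((start, end, phoneme))
        cursor = end
    if total_duration - cursor > eps:
        # Trailing silence after the last clip.
        stitched.append((cursor, total_duration, pause))
    return stitched


def to_audacity(labels: List[Label]) -> str:
    """Render labels as Audacity label-track text (start<TAB>end<TAB>text)."""
    return "\n".join(f"{s:.6f}\t{e:.6f}\t{p}" for s, e, p in labels)


if __name__ == "__main__":
    clips = [
        (1.0, [(0.0, 0.2, "k"), (0.2, 0.5, "a")]),
        (2.0, [(0.0, 0.3, "s")]),
    ]
    print(to_audacity(stitch_labels(clips, total_duration=3.0)))
```

Keeping times in seconds until the last moment makes it easy to swap `to_audacity` for another writer if a different label format is needed.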

AI-Generated Review

What is pyshiro?

pyshiro is a Python reimplementation of the SHIRO phoneme-to-speech alignment toolkit, optimized for Japanese singing voice and built on Hidden Semi-Markov Models. It aligns hiragana lyrics to 16 kHz audio clips and outputs phoneme boundaries as ENUNU-compatible .lab files, Praat TextGrid files, or Audacity labels, with built-in kana-to-phoneme conversion. Developers get CLI commands such as pyshiro-align for quick inference and pyshiro-train for training custom models from labeled corpora, all in a lightweight Python library hosted on GitHub.
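
To make the kana-to-phoneme idea concrete, here is a minimal sketch of how hiragana might be mapped to a phoneme sequence. It is not pyshiro's converter: the `KANA` table is a tiny illustrative subset, the function name `kana_to_phonemes` is hypothetical, and the symbols `N` (moraic nasal) and `cl` (geminate closure) follow a convention common in Japanese singing label sets, which may differ from what pyshiro actually emits.

```python
from typing import Dict, List

# Illustrative subset only; a real table covers the full kana inventory.
KANA: Dict[str, List[str]] = {
    "あ": ["a"], "い": ["i"], "う": ["u"], "え": ["e"], "お": ["o"],
    "か": ["k", "a"], "き": ["k", "i"], "く": ["k", "u"],
    "さ": ["s", "a"], "し": ["sh", "i"],
    "た": ["t", "a"], "な": ["n", "a"], "ら": ["r", "a"],
    "きゃ": ["ky", "a"], "しゃ": ["sh", "a"],  # two-character digraphs
    "ん": ["N"],   # moraic nasal (assumed symbol)
    "っ": ["cl"],  # geminate closure (assumed symbol)
}


def kana_to_phonemes(text: str) -> List[str]:
    """Convert a hiragana string to phonemes using greedy longest match."""
    phonemes: List[str] = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in KANA:
            # Prefer a digraph such as きゃ over reading き and ゃ separately.
            phonemes.extend(KANA[pair])
            i += 2
        elif text[i] in KANA:
            phonemes.extend(KANA[text[i]])
            i += 1
        else:
            raise ValueError(f"No phoneme mapping for {text[i]!r}")
    return phonemes


if __name__ == "__main__":
    print(kana_to_phonemes("さくら"))
    print(kana_to_phonemes("きゃっか"))
```

Trying the two-character match first is what keeps digraphs intact; a single-character lookup alone would fail on the small ゃ.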

Why is it gaining traction?

This pure-Python package avoids the C++ compilation hassles of the original SHIRO and runs in Jupyter notebooks or GitHub Actions for singing voice workflows. Features such as 2-pass alignment, skippable pauses and breaths, and triphone expansion target singing data, where general-purpose speech aligners tend to falter, and pre-trained models allow Japanese alignment without training a model first.
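
For readers unfamiliar with triphone expansion, the sketch below shows the general idea: each phoneme is relabeled with its left and right neighbours so that a model can learn context-dependent variants. The `left-center+right` notation is the HTK-style convention; the function name `expand_triphones` and the `sil` boundary symbol are assumptions for illustration, and pyshiro's internal representation may differ.

```python
from typing import List, Sequence


def expand_triphones(phonemes: Sequence[str], boundary: str = "sil") -> List[str]:
    """Turn a monophone sequence into left-center+right triphone labels."""
    # Pad both ends so the first and last phonemes also have two neighbours.
    padded = [boundary] + list(phonemes) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    print(expand_triphones(["s", "a", "k", "u", "r", "a"]))
```

The output has one label per input phoneme, so phoneme boundaries found with triphone models map back to the original sequence one to one.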

Who should use this?

Vocal synthesis engineers labeling Japanese singing datasets for ENUNU or VOCALOID-style models; music AI developers automating phoneme boundaries in karaoke apps or game audio mods; and researchers prototyping singing-voice pipelines in Python.

Verdict

Grab this for niche Japanese singing alignment: the CLI and Jupyter notebook workflow shine despite just 14 stars, and the listing gives it a 100% credibility score. Early maturity means you should vet it on small datasets first, but it is a practical reimplementation worth trying.
