kizuna-intelligence / Irodori-TTS-Lite
Int4-quantized inference runtime for the Faster-Irodori-TTS2 voice-design DiT. ~1 GB VRAM end-to-end.
Irodori-TTS-Lite is a lightweight runtime that runs a Japanese text-to-speech model in about 1 GB of GPU memory instead of nearly 3 GB. It works by loading int4-quantized model files that are 85% smaller on disk while keeping audio quality nearly identical. You install it with one command, and it hooks into your existing TTS system automatically, so you can type text and hear natural-sounding Japanese speech. The project includes tools to measure how much memory your setup uses, and you can choose between faster generation and even lower memory usage depending on your hardware.
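To make the idea of int4 compression concrete, here is a minimal sketch of symmetric per-group 4-bit quantization with two values packed per byte. It is an illustration of the general technique only: the function names, the group size, and the scaling rule are assumptions made for this example, and the project's actual quantization format may differ.

```python
def quantize_int4(weights, group_size=4):
    """Quantize floats to signed 4-bit codes, one float scale per group.

    Hypothetical helper for illustration; not the project's API.
    Returns (packed_bytes, scales), with two 4-bit codes per byte.
    """
    if group_size % 2 or len(weights) % group_size:
        raise ValueError("length must be a multiple of an even group_size")
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to code 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = max(-8, min(7, round(w / scale)))
            codes.append(q & 0xF)  # two's-complement nibble
    packed = bytes(codes[i] | (codes[i + 1] << 4)
                   for i in range(0, len(codes), 2))
    return packed, scales


def dequantize_int4(packed, scales, group_size=4):
    """Unpack the nibbles and rescale them back to approximate floats."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            codes.append(nibble - 16 if nibble >= 8 else nibble)
    return [q * scales[i // group_size] for i, q in enumerate(codes)]
```

Each weight occupies half a byte plus a share of its group's scale, and the reconstruction error per weight is bounded by half the group's scale. Real runtimes typically use much larger groups than the four used here.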
How It Works
You find that the original Japanese TTS model needs almost 3 GB of GPU memory, which is too much for your setup.
You install Irodori-TTS-Lite, which automatically downloads the compressed model weights from the internet.
With one simple command, your TTS system learns to read the compressed model files directly.
You enter Japanese text (for example, a greeting that begins with 'こんにちは') and the system converts it to speech.
You then pick one of two modes:
Keep everything on the GPU for ~1 second generation time.
Move the codec to the CPU, using only ~1 GB of GPU memory in total.
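The choice between the two modes can be sketched as a simple placement decision. This is a hypothetical helper, not part of Irodori-TTS-Lite: the names `Placement` and `choose_placement` are invented for this example, and the component sizes are parameters you would have to measure yourself rather than figures from the project.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    dit_device: str    # where the quantized DiT runs
    codec_device: str  # where the audio codec runs


def choose_placement(free_vram_gb, dit_gb, codec_gb):
    """Keep the codec on the GPU only when both components fit.

    All sizes are caller-supplied measurements in GB (assumed inputs).
    """
    if dit_gb > free_vram_gb:
        raise MemoryError("the DiT alone does not fit in free VRAM")
    if dit_gb + codec_gb <= free_vram_gb:
        # Faster mode: everything stays on the GPU.
        return Placement(dit_device="cuda", codec_device="cuda")
    # Low-memory mode: the codec moves to the CPU.
    return Placement(dit_device="cuda", codec_device="cpu")
```

The trade-off is the one described above: keeping the codec on the GPU favors generation speed, while moving it to the CPU lowers GPU memory use.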
The generated speech is saved as a WAV file, sounding nearly identical to the original uncompressed version.
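As a rough sketch of the final step, the following writes mono floating-point samples to a 16-bit PCM WAV file using only the Python standard library. The function name `write_wav` and the 24000 Hz default sample rate are assumptions for this example; the runtime's own output routine and sample rate may differ.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM.

    Hypothetical helper; sample_rate is an assumed default.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # avoid integer overflow
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

For instance, passing one second of a sine tone, built with `math.sin`, produces a file that any standard audio player can open.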
Your voice project now runs on just 1 GB of GPU memory instead of nearly 3 GB, with almost no loss in quality.