bovod-sjtu / HoliTok
HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding
HoliTok is an open-source audio processing tool that compresses audio into compact continuous latent representations (latents), reconstructs audio from those latents, and extracts semantic features for audio understanding.
How It Works
You hear about HoliTok from a colleague or online: it can transform audio files into compact representations and back again.
You install HoliTok on your computer using a simple installation command, and it automatically downloads the pre-trained models.
You choose one of three operations:
Compress: transform your audio into a compact latent representation that takes up much less space.
Reconstruct: turn latents back into audio, which is useful for testing how well the compression works.
Extract: pull out high-level features that describe what is in your audio, such as speech content or audio characteristics.
You point the tool at your audio file and let it process; everything runs automatically on your computer's graphics card (GPU).
You receive your output: compressed latents, reconstructed audio, or semantic features, ready for your next project.
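The round trip described above (audio to latents, latents back to audio, then a check of how much was lost) can be illustrated with a toy sketch. The functions `encode`, `decode`, and `snr_db` are hypothetical stand-ins written for this example and are not HoliTok's API; simple block averaging takes the place of the pre-trained model purely to show the shape of the workflow.

```python
import math

def encode(samples, hop=4):
    """Toy encoder (assumption, not HoliTok): average each block of `hop` samples."""
    return [sum(samples[i:i + hop]) / len(samples[i:i + hop])
            for i in range(0, len(samples), hop)]

def decode(latents, hop=4, length=None):
    """Toy decoder (assumption, not HoliTok): repeat each latent `hop` times, then trim."""
    out = [z for z in latents for _ in range(hop)]
    return out if length is None else out[:length]

def snr_db(reference, estimate):
    """Signal-to-noise ratio in decibels between the original and its reconstruction."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)

# One second of a 5 Hz sine sampled at 800 Hz stands in for an audio file.
sample_rate = 800
audio = [math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(sample_rate)]

latents = encode(audio)                             # compress
reconstructed = decode(latents, length=len(audio))  # reconstruct

print("samples:", len(audio), "latent values:", len(latents))
print("reconstruction SNR (dB):", round(snr_db(audio, reconstructed), 1))
```

A higher SNR means the reconstruction is closer to the original. A real learned tokenizer would be judged with the same kind of comparison, usually alongside perceptual metrics.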