inclusionAI / Ming-omni-tts
Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control
Ming-omni-tts is a unified AI model that generates controllable speech, music, and sound effects from text prompts, supporting voice cloning, emotions, dialects, and text normalization.
How It Works
You hear about a fun tool that turns your words into lifelike speech, music, or sounds with custom voices and feelings.
Head to the demo page to type a phrase and instantly hear it spoken in different voices or with background music.
Choose a built-in voice, describe a new one like 'cheerful grandma', or upload a short clip of someone's voice to clone it.
Write your text, then add simple instructions such as 'speak slowly with joy' or 'add rain sounds in the background'.
Hit play and listen as realistic speech, tunes, or effects come alive the way you described.
Download your custom audio clip to use in stories, videos, or podcasts, delighting friends and family.