NAVA is an AI system developed by Baidu's ERNIE team that generates synchronized audio and video content from text descriptions, with support for voice cloning, image animation, and audio-only generation.
How It Works
You hear about an AI that creates videos with perfectly synchronized sound from just a text description.
You download the project and the AI model weights with a few simple commands; everything you need comes in one package.
You write a simple description like 'a surfer riding a wave at sunset' or 'two people having a conversation in a coffee shop'.
You click 'Rewrite' and watch as your simple description transforms into a detailed, cinematic prompt that brings out the AI's full potential.
Optionally, you upload a short voice sample, and the AI uses that voice for the generated speech.
Optionally, you upload a starting image, and the AI animates it into a video.
Optionally, you generate only sound effects or speech, without video.
You click Generate and wait about a minute while the AI creates your video with naturally synchronized audio.
Your video plays with audio that matches the action: lip movements sync with speech, and sound effects match the scene.