Soul-AILab/SoulX-LiveAct

Official inference code for SoulX-LiveAct: Towards Hour-Scale Real-Time Human Animation with Neighbor Forcing and ConvKV Memory

AI Summary

SoulX-LiveAct generates lifelike, real-time human animation videos from a reference image and audio input, using neighbor forcing and a ConvKV memory to sustain hour-scale generation.

How It Works

1. 🔍 Discover LiveAct

You stumble upon a fun tool that turns a single photo and a voice recording into a lively talking video.

2. 📦 Get the code

Clone the inference code and download the ready-made model weights with a few clicks (a setup sketch follows this list).

3. 🖼️🎤 Add your photo and voice

Upload a picture of someone special and an audio clip of their voice, like a podcast or song.

4. ▶️ Watch it come alive

Press play and watch the person move, lip-sync, and emote in real time, as if they were really talking.

🎥✨ Your video is magic!

Save your smooth, realistic animated clip to share with friends or use in videos.
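
For readers who want the concrete version of steps 2 and 3, setup could look like the sketch below. The repository URL, Hugging Face repo id, and file names are assumptions for illustration; check the official README for the real ones.

```bash
# Hypothetical quick-start -- the repo URL, weight location, and paths are
# assumptions, not verified against the official README.
git clone https://github.com/Soul-AILab/SoulX-LiveAct.git
cd SoulX-LiveAct
pip install -r requirements.txt

# Pull the pretrained model weights from Hugging Face (repo id assumed).
huggingface-cli download Soul-AILab/SoulX-LiveAct --local-dir ./weights
```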

AI-Generated Review

What is SoulX-LiveAct?

SoulX-LiveAct is the official inference code for generating lifelike, hour-scale, real-time human animation videos from a reference image, streaming audio, and text prompts. Built on Python and PyTorch, it produces multimodal talking-head videos synced to speech or music at 20 FPS on dual H100 GPUs or 6 FPS on a single RTX 5090. Developers can run CLI batch jobs with `generate.py` or launch the GUI streaming demo via `demo.py` for podcast- and FaceTime-style interactions.
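
As a rough illustration of the two entry points, invocations might look like the lines below; the script names come from the repo, but every flag is a guess for illustration, not a documented option.

```bash
# Flags shown here are assumptions only -- see the repo's README.
# Offline batch generation from a reference image plus an audio track:
python generate.py --ref_image speaker.png --audio talk.wav --output talk.mp4

# Real-time streaming demo with a GUI front end:
python demo.py --ref_image speaker.png
```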

Why is it gaining traction?

It sidesteps the usual KV-cache memory blow-up with ConvKV compression, keeping VRAM constant no matter how long the video runs, and uses neighbor forcing for seamless real-time diffusion. FP8 kernels, block offloading, and distributed runs make consumer GPUs viable without visible quality drops. Quick-start scripts and ready-to-download Hugging Face model weights hook experimenters chasing animation inference speedups.
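
The repo's ConvKV details aren't spelled out here, so the PyTorch sketch below only illustrates the general idea of a convolutionally compressed KV cache. It is a toy under stated assumptions, not the project's implementation; every class name, shape, and compression schedule in it is made up.

```python
import torch
import torch.nn as nn

class ConvKVMemory(nn.Module):
    """Toy ConvKV-style cache: compress old KV entries with a strided conv."""

    def __init__(self, dim: int, max_len: int = 256, stride: int = 4):
        super().__init__()
        self.max_len = max_len
        # Strided temporal conv: every `stride` old KV slots -> one summary slot.
        self.compress = nn.Conv1d(dim, dim, kernel_size=stride, stride=stride)

    @torch.no_grad()
    def append(self, cache: torch.Tensor, new_kv: torch.Tensor) -> torch.Tensor:
        # cache and new_kv are (batch, time, dim).
        cache = torch.cat([cache, new_kv], dim=1)
        if cache.size(1) > self.max_len:
            # Compress the oldest half; keep the recent half at full resolution.
            half = cache.size(1) // 2
            old, recent = cache[:, :half], cache[:, half:]
            old = self.compress(old.transpose(1, 2)).transpose(1, 2)
            cache = torch.cat([old, recent], dim=1)
        return cache

mem = ConvKVMemory(dim=64)
kv = torch.zeros(1, 0, 64)
for _ in range(100):                  # simulate 100 streaming KV chunks
    kv = mem.append(kv, torch.randn(1, 8, 64))
print(kv.shape)  # the time axis stays bounded instead of growing to 800
```

The recent window stays uncompressed for fidelity while older context is summarized, which is how attention cost and VRAM can stay roughly constant over arbitrarily long runs.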

Who should use this?

AI researchers building audio-driven avatars, streaming devs prototyping live video synthesis, or content creators automating lip-sync from podcasts and talk shows.

Verdict

A solid starting point for real-time human animation inference on modest hardware, but at 45 stars it is still early in its life: the docs shine, yet test edge cases before deploying. Worth forking if hour-scale, neighbor-forced generation excites you.
