
LongCat Avatar is an expressive avatar model built on LongCat-Video and designed for audio-driven character animation. It automatically converts audio, text, and image inputs into highly realistic, lip-synchronized long videos with natural motion and consistent identity. The model accepts several input combinations, including audio + text and photo + audio workflows, and delivers high-definition output at up to 720p.