Meet LongCat-Video-Avatar 1.5 🐱, our upgraded, open-source digital human framework.
Built for real production, not just short demos.
What's New:
🔹 Upgraded Audio Encoder: Replaces Wav2Vec2 with Whisper-Large, yielding significantly smoother and more natural lip dynamics.
🔹 Production-Ready Stability: Achieves accurate lip synchronization, full-body temporal stability, and robust long-video generation with strict identity consistency.
🔹 Stylized Domain Generalization: Generalizes robustly to anime, animals, and complex real-world conditions such as multi-person interactions and object handling.
🔹 Efficient 8-Step Inference: Advanced step distillation cuts inference to 8 NFE (number of function evaluations, i.e. 8 forward passes of the model per generation), balancing cost-effective serving with high visual fidelity.
LongCat-Video-Avatar 1.5 performs strongly in realism, naturalness, and stability, outperforming leading open-source models and closed-source systems.
🐱 The Avatar 1.5 framework is now open source:
Weights & Code: https://github.com/meituan-longcat/LongCat-Video
HuggingFace: https://huggingface.co/meituan-longcat/LongCat-Video-Avatar-1.5
Tech Report: https://github.com/meituan-longcat/LongCat-Video/blob/main/assets/LongCat-Video-Avatar-1.5-Tech-Report.pdf
Project Page: https://meigen-ai.github.io/LongCat-Video-Avatar-1.5-Page/
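
To try the release locally, the weights can be pulled from the HuggingFace link above. The snippet below is a minimal sketch, not the project's official setup procedure: it assumes the `huggingface_hub` package is installed, and the helper names (`repo_id_from_url`, `download_weights`) and the `weights/...` target directory are hypothetical choices made for illustration.

```python
from urllib.parse import urlparse

# HuggingFace link from the announcement.
HF_URL = "https://huggingface.co/meituan-longcat/LongCat-Video-Avatar-1.5"


def repo_id_from_url(url: str) -> str:
    """Return the 'owner/name' repo ID from a HuggingFace model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def download_weights(local_dir: str = "weights/LongCat-Video-Avatar-1.5") -> str:
    """Download the full model repo; `local_dir` is a hypothetical default."""
    # Imported lazily so repo_id_from_url works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(HF_URL), local_dir=local_dir)
```

Calling `download_weights()` returns the local path of the downloaded snapshot. Environment setup and the actual inference commands are not covered here; for those, check the README in the GitHub repository.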