LongLive 2.0 Achieves 2x Real-Time Video Generation With NVFP4 Optimizations

Original post

🚀 LongLive 2.0 just got faster! Since last week’s release, we further optimized the NVFP4 inference path and improved the overall throughput by 18.6%.

🔥 Now, generating a 64s video takes only 30.6s end-to-end, including VAE decoding. ⚡⚡ That’s over 2× real-time generation.

🛠️ What changed under the hood?
• Fused Triton RoPE / adaLN kernels
• Reduced KV-cache synchronization overhead
• In-place quantized KV-cache updates
• Faster FP4 KV dequantization
• Pinned VAE transfers
• Safer LoRA-before-quantization setup

🎬 LongLive 2.0 is our open-source 4-bit long-video generation infra for both training and inference.

🚀 We are continuing to push long-video generation toward faster, lighter, and more practical deployment.

🔗 Code: https://github.com/NVlabs/LongLive

#LongVideoGeneration #VideoGeneration #Realtime #AIInfra #EfficientAI #FP4 #Parallel #NVIDIA

8:45 AM · May 25, 2026
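
A quick sanity check of the numbers quoted in the post, as a minimal sketch. The three input figures come from the post itself; the variable names are illustrative, and the "implied previous time" rests on the assumption that throughput is inversely proportional to end-to-end time for the same 64s clip, which the post does not state.

```python
# Figures quoted in the post.
video_seconds = 64.0      # length of the generated video
wall_seconds = 30.6       # end-to-end generation time, including VAE decoding
throughput_gain = 0.186   # reported 18.6% throughput improvement

# Real-time factor: seconds of video produced per second of wall-clock time.
realtime_factor = video_seconds / wall_seconds

# ASSUMPTION (not stated in the post): throughput scales as 1 / wall time,
# so the earlier release would have needed (1 + gain) times as long.
implied_previous_seconds = wall_seconds * (1 + throughput_gain)

print(f"real-time factor: {realtime_factor:.2f}x")
print(f"implied previous time: {implied_previous_seconds:.1f}s")
```

Under that assumption, the same clip would have taken roughly 36s before these optimizations, which is still faster than real time but below the 2× mark.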