🚀 LongLive 2.0 just got faster!
Since last week’s release, we have further optimized the NVFP4 inference path, improving overall throughput by 18.6%.
🔥 Now, generating a 64s video takes only 30.6s end-to-end, including VAE decoding.
⚡⚡ That’s over 2× real-time generation.
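For anyone checking the math, here is a quick sketch of the real-time factor using only the two numbers quoted above. The variable names are ours, for illustration only.

```python
# Numbers quoted in the post: 64 s of video generated in 30.6 s end-to-end.
video_seconds = 64.0
wall_clock_seconds = 30.6

# Real-time factor: seconds of video produced per second of wall-clock time.
realtime_factor = video_seconds / wall_clock_seconds

print(f"{realtime_factor:.2f}x real-time")  # prints "2.09x real-time"
```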
🛠️ What changed under the hood?
• Fused Triton RoPE / adaLN kernels
• Reduced KV-cache synchronization overhead
• In-place quantized KV-cache updates
• Faster FP4 KV dequantization
• Pinned VAE transfers
• Safer LoRA-before-quantization setup
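To give a feel for what FP4 dequantization involves, here is a minimal pure-Python sketch of the idea. It is not LongLive's kernel (the actual implementation is in the repo and runs on the GPU). It assumes the standard E2M1 4-bit value set, two codes packed per byte with the low nibble first, and one shared scale per block. The function name, packing order, and scale handling are our assumptions for illustration.

```python
# E2M1 magnitudes (2 exponent bits, 1 mantissa bit), indexed by the low 3 bits.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

# 16-entry lookup table: bit 3 is the sign, bits 0-2 select the magnitude.
FP4_LUT = tuple(
    -E2M1_MAGNITUDES[code & 0x7] if code & 0x8 else E2M1_MAGNITUDES[code & 0x7]
    for code in range(16)
)


def dequantize_fp4_block(packed: bytes, scale: float) -> list[float]:
    """Unpack two 4-bit codes per byte and apply one shared block scale.

    Assumption (hypothetical layout): the low nibble holds the first element.
    """
    out = []
    for byte in packed:
        out.append(FP4_LUT[byte & 0x0F] * scale)  # first element: low nibble
        out.append(FP4_LUT[byte >> 4] * scale)    # second element: high nibble
    return out


# Two bytes hold four values; the block scale here is 0.5.
values = dequantize_fp4_block(bytes([0x21, 0xF7]), scale=0.5)
print(values)  # [0.25, 0.5, 3.0, -3.0]
```

A table lookup like this replaces per-element floating-point decoding, which is one common way to keep dequantization cheap. A real kernel would do the same work vectorized and fused with the attention read.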
🎬 LongLive 2.0 is our open-source 4-bit long-video generation infra for both training and inference.
🚀 We are continuing to push long-video generation toward faster, lighter, and more practical deployment.
🔗 Code: https://github.com/NVlabs/LongLive
#LongVideoGeneration #VideoGeneration #Realtime #AIInfra #EfficientAI #FP4 #Parallel #NVIDIA