SGLang releases version 0.5.12 with a TokenSpeed MLA attention backend optimized for NVIDIA Blackwell GPUs and FP8 KV cache support

The performance gains target DeepSeek V3.2 and GLM-5 and come from GEMM optimizations.

Original post

SGLang had the right performance foundations. If you're not seeing SOTA performance for your deployment with OSS models, try again and ask for help, since you should get it.

9:35 PM · May 17, 2026