SGLang releases version 0.5.12 with a TokenSpeed MLA attention backend optimized for NVIDIA Blackwell GPUs and FP8 KV cache support.
Gains target DeepSeek V3.2 and GLM-5 via GEMM optimizations.