Day-0 support is the starting point, not the finish line. The SGLang community has always focused on turning frontier models into fast, production-ready serving systems as quickly as possible.
In 2 months, folks achieved a 5x throughput speedup on NVIDIA GB300.
Proud of the @sgl_project and @NVIDIAAI teams pushing this forward.
While SGLang provided Day-0 support for DeepSeek-V4, the collaboration between the @lmsysorg and @NVIDIAAI engineering teams has taken its production performance to the next level.
According to the public SemiAnalysis InferenceX dashboard, the GB300 disaggregated lane (DeepSeek-V4 Pro, FP4, 8K/1K) saw a 5x throughput increase, rising from ~2,200 to ~11,200 tok/s/GPU at identical interactivity levels. These updates sustain high throughput much deeper into the interactivity ranges most deployments target, while also driving a 2.9x lift on the Blackwell Ultra aggregated lane.
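As a quick sanity check on the headline number, here is a minimal Python sketch that recomputes the ratio from the two approximate per-GPU figures quoted above. The function and variable names are illustrative, not taken from the dashboard, and the inputs are the rounded values, so the result is approximate.

```python
# Approximate per-GPU throughput figures quoted above (tok/s/GPU).
# These are the rounded values from the post, not raw dashboard data.
before_tok_s_per_gpu = 2_200
after_tok_s_per_gpu = 11_200


def speedup(before: float, after: float) -> float:
    """Return the throughput ratio after / before."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


ratio = speedup(before_tok_s_per_gpu, after_tok_s_per_gpu)
print(f"{ratio:.2f}x")  # prints 5.09x
```

The ratio of the rounded figures comes out slightly above 5, consistent with the 5x claim.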
Find the full technical breakdown in the comments below:










