
SGLang And Miles Add Day-0 Support For NVIDIA Nemotron 3 Ultra

Original post Banghua Zhu #1147
LMSYS Org (@lmsysorg)

🚀 New blog: SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra for Long-Running Autonomous Agents

The hard part of agentic workloads isn't one big answer, it's sustaining reasoning across hundreds of steps. Here's how we deliver:

✅ Hybrid Mamba-Transformer MoE: Mamba keeps long context cheap, Transformer layers preserve exact recall when agents retrieve facts
✅ One NVFP4 checkpoint, two GPU generations: same weights run on Hopper & Blackwell, no requant
✅ MTP cuts multi-turn latency by predicting multiple tokens per pass
✅ Miles RL: verified GRPO pipeline on 128 H200s in colocate mode. DP attention breaks the n_groups=8 TP cap to unlock large-scale EP for a Mamba-hybrid MoE

The full write-up covers the serving setup, the GRPO pipeline, on-policy verification, and reproducible Docker + scripts.

6:26 AM · Jun 4, 2026 · 1.2K Views