SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra

LMSYS Org (@lmsysorg)

🚀 New blog: SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra for Long-Running Autonomous Agents

The hard part of agentic workloads isn't one big answer, it's sustaining reasoning across hundreds of steps. Here's how we deliver:

✅ Hybrid Mamba-Transformer MoE: Mamba keeps long context cheap, Transformer layers preserve exact recall when agents retrieve facts
✅ One NVFP4 checkpoint, two GPU generations: same weights run on Hopper & Blackwell, no requant
✅ MTP cuts multi-turn latency by predicting multiple tokens per pass
✅ Miles RL: verified GRPO pipeline on 128 H200s in colocate mode. DP attention breaks the n_groups=8 TP cap to unlock large-scale EP for a Mamba-hybrid MoE

The full write-up covers the serving setup, the GRPO pipeline, on-policy verification, and reproducible Docker + scripts.
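For readers unfamiliar with GRPO, the core idea is that each sampled response is scored relative to the other responses drawn for the same prompt, so no separate value model is needed. The sketch below illustrates only that group-relative normalization step. It is not taken from the Miles pipeline: the function name, the `eps` constant, the use of population standard deviation, and the reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of sampled responses.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group std + eps). eps is an assumed guard against division by zero when
    every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four responses sampled for a single prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# Responses above the group mean get positive advantages, the rest negative.
print(advantages)
```

A group in which every response earns the same reward yields all-zero advantages, so under this scheme such a prompt contributes no policy-gradient signal.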

6:26 AM · Jun 4, 2026 · 1.2K Views

LMSYS Org (@lmsysorg)

Read full blog: https://www.lmsys.org/blog/2026-06-04-nvidia-run-nemotron-3-ultra/
