NVIDIA Sticks to 10% Activation in Nemotron 3 via LatentMoE Research

Alex J. Champandard 🌱#1352

Bleys Goodson@bleysg

There is a backstory to why @NVIDIAAI has stuck to 10% throughout the Nemotron 3 series, including the new 550B Ultra model, while most of the industry chases MoE with 3-5% activation.

LatentMoE is that story. They argue that effective MoEs should be evaluated along two dimensions: accuracy per FLOP and accuracy per parameter. The race toward 3-5% activation implicitly optimizes only the first.

https://arxiv.org/abs/2601.18089v1

8:05 PM · Jun 1, 2026 · 1.8K Views
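To make the two dimensions concrete, here is a minimal Python sketch of the arithmetic. It is not taken from the LatentMoE paper. It assumes that "activation" means the fraction of total parameters used per token, and that forward-pass FLOPs per token are roughly twice the active parameter count (a common rule of thumb). The function names and the accuracy value are hypothetical, and accuracy is held fixed only to isolate the arithmetic; in practice it would change with the activation ratio.

```python
import math


def active_parameters(total_params: float, activation: float) -> float:
    """Parameters used per token, ASSUMING activation = active / total."""
    if not 0.0 < activation <= 1.0:
        raise ValueError("activation must be in (0, 1]")
    return total_params * activation


def accuracy_per_flop(accuracy: float, total_params: float, activation: float) -> float:
    """Accuracy divided by approximate forward FLOPs per token.

    ASSUMPTION: FLOPs per token ~= 2 * active parameters (rule of thumb).
    """
    flops_per_token = 2.0 * active_parameters(total_params, activation)
    return accuracy / flops_per_token


def accuracy_per_parameter(accuracy: float, total_params: float) -> float:
    """Accuracy divided by the total (stored) parameter count."""
    return accuracy / total_params


# Hypothetical inputs: 550e9 echoes the "550B" size named in the post;
# the accuracy value is a placeholder, not a reported result.
TOTAL = 550e9
ACC = 0.5

# Dropping activation from 10% to 4% at the same total size and accuracy
# improves accuracy per FLOP, while accuracy per parameter is unchanged
# because the total parameter count did not move.
flop_gain = accuracy_per_flop(ACC, TOTAL, 0.04) / accuracy_per_flop(ACC, TOTAL, 0.10)
param_gain = accuracy_per_parameter(ACC, TOTAL) / accuracy_per_parameter(ACC, TOTAL)

print(f"active params at 10%: {active_parameters(TOTAL, 0.10):.3e}")
print(f"accuracy-per-FLOP gain (10% -> 4%): {flop_gain:.2f}x")
print(f"accuracy-per-parameter gain (10% -> 4%): {param_gain:.2f}x")
```

Under these assumptions, lowering the activation ratio changes only the per-FLOP figure, which is one way to read the post's claim that the 3-5% race optimizes only the first dimension.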

Sentiment

Users are optimistic about NVIDIA sticking to 10% activation in Nemotron 3 MoE models because the architectures better optimize for their available GB200 hardware.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.
