NVIDIA Sticks to 10% Activation in Nemotron 3 via LatentMoE Research

There is a backstory to why @NVIDIAAI has stuck to 10% throughout the Nemotron 3 series, including the new 550B Ultra model, while most of the industry chases MoE with 3-5% activation.

LatentMoE is that story. They argue that effective MoEs should be evaluated along two dimensions: accuracy per FLOP and accuracy per parameter. The race toward 3-5% activation implicitly optimizes only the first.

https://arxiv.org/abs/2601.18089v1

8:05 PM · Jun 1, 2026 · 1.8K Views