NVIDIA pretrains Nemotron 3 models in 4-bit NVFP4
NVIDIA has pretrained its Nemotron 3 Super and Nemotron 3 Ultra models entirely in 4-bit NVFP4 precision. Nemotron 3 Super has 120 billion parameters and was trained on 25 trillion tokens; Nemotron 3 Ultra reaches roughly 500 billion parameters on the same token volume. NVIDIA vice president Bryan Catanzaro described the fully reduced-precision pretraining runs as part of a broader effort to raise training efficiency at scale.
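NVFP4 is a 4-bit floating-point format (E2M1 values) combined with fine-grained block scaling. The following minimal numpy sketch illustrates the basic idea of quantizing a small block of weights to an FP4 grid with a per-block scale; the function name, the 16-element block size, and the use of plain floats for the scales are simplifying assumptions for illustration, not NVIDIA's implementation (real NVFP4 stores the block scale in FP8 plus an additional per-tensor scale).

import numpy as np

# Representable magnitudes of an E2M1 (4-bit) float: sign + {0, 0.5, 1, 1.5, 2, 3, 4, 6}
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(block: np.ndarray) -> tuple[np.ndarray, float]:
    """Illustrative sketch: quantize one block of values to NVFP4-like numbers.

    Simplification: the per-block scale is kept as a plain Python float instead
    of an FP8 (E4M3) value, and the per-tensor FP32 scale is omitted.
    """
    # Scale the block so its largest magnitude maps to the top FP4 value (6).
    amax = np.max(np.abs(block))
    scale = amax / 6.0 if amax > 0 else 1.0
    scaled = block / scale
    # Snap each scaled value to the nearest representable FP4 magnitude, keeping the sign.
    idx = np.argmin(np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]), axis=1)
    quantized = np.sign(scaled) * FP4_GRID[idx]
    return quantized, scale

# Example: quantize a random 16-element weight block and check the round-trip error.
block = np.random.randn(16).astype(np.float32)
q, s = quantize_nvfp4_block(block)
print("max abs error:", np.max(np.abs(block - q * s)))

The small block size matters because one shared scale per block limits how much a single outlier can degrade the precision of its neighbors, which is part of what makes 4-bit training feasible at all.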
We've gone even farther: Nemotron 3 Super is 120B and pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra is ~500B and also pretrained in NVFP4.
Accelerated computing means we rethink every aspect of the AI stack looking for new opportunities to improve efficiency.
@ctnzr @max_paperclips This is great work. Is it possible to give wall clock time for 4-bit vs 8-bit training?
@ctnzr @max_paperclips Now give us a 10T parameter frontier model we can run on a small cluster at home, Bryan! The world is counting on you!