AI · 2h ago

AI2's Nathan Lambert says Nvidia's multi-teacher on-policy distillation for Nemotron 3 Ultra is the post-training industry standard

The pipeline uses over 10 specialized teacher models.

Quote posts

#64

Comments

#129

Original post

Nathan Lambert (@natolambert) · #64 in AI

Nvidia joined the multi-teacher, on-policy distillation (MODP) gang! Is industry standard post-training right now.

The multi-teacher SFT to RL that Microsoft did in their first model was the standard established by DeepSeek R1. I expect MAI 2 to be MODP.

6:36 AM · Jun 4, 2026 · 14.3K Views

Sentiment

Positive users praise Nvidia's multi-teacher distillation redesign for Nemotron 3 Ultra post-training as an interesting technical step forward, while negative users dismiss the benchmarks as unremarkable compared to 2021 results.

Positive: 75.0% · Negative: 25.0%

4 comments with sentiment.

Posts from X

Most Activity

Views: 3.6K · Bookmarks: 25 · Likes: 42 · Retweets: 2 · Replies: 5

finbarr (@finbarrtimbers)

The Nemotron 3 Ultra post-training pipeline is verrrry impressive.

1h ago