19h ago

Meta AI's Konstantin Mishchenko posts overview of recent machine learning optimization trends covering Muon optimizer variants, schedulers, scaling rules, parameter normalization, and DiLoCo techniques

Mishchenko framed the summary as a reply to a question on optimization developments.

Original post

I was responding to a question about what's been happening in optimization and realized someone else might find it useful, so here it is, my biased(!) perspective with links.

12:23 PM · May 22, 2026