Meta AI's Konstantin Mishchenko posts overview of recent machine learning optimization trends covering Muon optimizer variants, schedulers, scaling rules, parameter normalization, and DiLoCo techniques
Mishchenko framed the summary as a reply to a question on optimization developments.