xAI co-founder Guodong Zhang highlights 2019 paper showing alternative optimizers fail to outperform SGD with momentum in noise-dominated regimes

Below the critical batch size, where gradient noise dominates, alternative algorithms offer limited gains over SGD with momentum.

Original post
Guodong Zhang @Guodzh · #515 in Tech

spent half of my PhD working on optimization research, I only published one negative result paper showing beating SGD with momentum is hard especially when noise dominated regime 🤣

2:43 PM · Jun 10, 2026 · 14.8K Views
Posts from X
Most Activity: Views 4.9K · Bookmarks 24 · Likes 21 · Retweets 1
You Jiacheng @YouJiacheng

@Guodzh this one right? https://arxiv.org/abs/1907.04164

7h · Views 4.9K · Likes 21 · Bookmarks 24

and it ended up in 2022/2023 many OAI/DM ppl told me they learnt most things about neural network training from that paper

7h · Views 440 · Likes 7 · Bookmarks 2
Zachary Nado @zacharynado

@Guodzh the OG optimizers crew 💙

1h · Views 170 · Likes 1 · Bookmarks 0
Ethan TS. Liu @ethantsliu

@Guodzh 😭 how ml theory usually goes

6h · Views 25