AI · 5h ago

CoreAutoAI co-founder Rohan Anil argues ML researchers focus on incremental optimization variants instead of questioning core formulations

Lucas Beyer noted a similar pattern with ResNets and ViTs.

Original post
rohan anil @_arohan_ · #79 in AI

I don’t know what the phenomenon is called:

Sometimes the field mines improvements near a local neighborhood.

Like Adam -> (badam, dadam, madam), Shampoo -> Muon -> (Duon, Buon, Luon), the last few made up, instead of questioning whether the original formulation itself is the right question. You get so much math explaining these variants, bordering on slop. The same happened with Transformers too.

Mathematically sophisticated but solving the wrong problem.

10:44 AM · Jun 8, 2026 · 10.7K Views
Sentiment

Many users dismiss local variants of Adam and Shampoo as unoriginal spam and noise, driven by groupthink and publish-or-perish incentives, that fail to address base assumptions.

Pos 0.0% · Neg 100.0%
7 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 3.3K · Bookmarks 3 · Likes 40 · Retweets 3 · Replies 3
rohan anil @_arohan_

LLMs have made it easy to spam methodological improvements that largely border on noise.

5h · Views 3.3K · Likes 40 · Bookmarks 3

Lucas Beyer

@_arohan_ I don't think it's new or LLM-related. We had the same with ResNets and later with ViTs, and those were before LLMs. It's just the easy research to do.

2h · Views 632 · Likes 9 · Bookmarks 0
Lucas Nestler @Clashluke

@_arohan_ it’s difficult seeing [outside of] your box

2h · Views 300 · Likes 3 · Bookmarks 0
ueaj @_ueaj

@_arohan_ Thoughts on AdEMAMix? I think it kinda missed the multiscale inductive bias, but it was very close and very early

5h · Views 411 · Likes 2
Sachin @sachdh

@_arohan_ GRPO variants from last year will say hi

5h · Views 277

@_arohan_ @nathancgy4 Oh man, I think I have a really good explanation for this, but it's a bit longer than a tweet. I might have to blog about this: I think it's because all components of neural network training have to work together, so it's hard to do non-local improvements.

5h · Views 248
Alex YGift @Radipdegen

@_arohan_ Wait, I think you're suggesting all those extensions are just post-hoc rationalizations pasted onto an original breakthrough?

5h · Views 106
Rugbist @rugbist_

@_arohan_ feels like naming satire just keeps becoming realer over time

not sure if we need new names or just to sit with the original ones longer

5h · Views 41
ashu @pizzacritic999

@_arohan_ "badam" (almond)

4h · Views 38
Invincible @InvincibleEdge

@_arohan_ real recognize real. discovering meta-methods while ignoring base assumptions is how the loop keeps running

5h · Views 33
Lavan @ponylavan

@_arohan_ groupthink + publish or perish

5h · Views 28
M @init_malachi

@_arohan_ yes so dissatisfying but apparently everything else is market irrelevant

3h · Views 24
Zack Fitch @Jzfitch1

@_arohan_ People show up for hacks, not first principles.

3h · Views 24
Blissy @BlissyOnX

@_arohan_ greedy optimization is called local maxima. but honest question - is there a taxonomy for the "just keep changing letters" phase?

5h · Views 21
jaisel @jaiselsingh

i sometimes wonder if we're just doing a local search over research programs. once the abstraction is fixed, you get high-sophistication perturbations: Adam→variants, Shampoo→variants, Transformer→variants. i wonder if it's even optimizing inside the right model class most of the time

5h · Views 14
Logan Ford @lhford0

@_arohan_ anyone who has followed AI research closely knows that AI slop has been around a lot longer than LLMs

2h · Views 6
tiplur-bilrex @tiplur_bilrex

@_arohan_ Some relevant advice from Hamming:

5h · Views 2