AI · 8h ago

Simo Ryu argues Claude's ability to code advanced optimizers like Muon and Shampoo hints at alternative internal training methods

Academic Dan Roy highlighted the model's mathematical utility.

Original post
Simo Ryu@cloneofsimo · #601 in AI

@_arohan_ @bilaltwovec It means they are using something else?

rohan anil@_arohan_

Claude Fable let me implement Muon, Shampoo and K-FAC, what does this mean?

12:24 AM · Jun 10, 2026 · 534 Views
Sentiment

One user offered to collaborate on a write-up about sparse autoencoders with Muon and Adam, while most others suspected Claude of silent sabotage or undisclosed modifications, or suggested that these optimizers are not frontier techniques.

Pos 14.3% · Neg 85.7% (7 comments with sentiment)
Cluster Engagement
Posts from X
Dan Roy@roydanroy

@_arohan_ Haha.

2h · Views 1.2K · Likes 3 · Bookmarks 0
K@k0x3k

@_arohan_ Does that mean it’s not frontier 😒🥲?

9h · Views 651 · Bookmarks 1
rohan anil@_arohan_

Claude Fable let me implement Muon, Shampoo and K-FAC, what does this mean?

11h · Views 17.7K · Likes 209 · Bookmarks 31 · Retweets 3
ueaj@_ueaj

@_arohan_

10h · Views 867 · Likes 3
Adam Mainz@MainzOnX

@_arohan_ Oof your kernel probably could have just been a PyTorch aten op call

9h · Views 683 · Likes 2
Olcan@olcan

@_arohan_ maybe their detector failed, maybe those things are not at the frontier, maybe it silently sabotaged your implementations

10h · Views 706 · Likes 1
Frosty40@FrostForger

actively sabotaging research. I can't, man... your muon shamploo super soaker sodapop works when a major company declares war on its own customers in the form of subtle sabotage. also we know, your optimizer is the best. anyone arguing is still at whatever lab they chose, and not at CA, so yours will be the best. but we got some serious problems out here on the frontlines. I fucking switched careers (lol, I don't have a career in AI), and study every day to know about my future, and be a father who can lead his kids safely into an unknown future. Fuck the money, man, this is so much bigger than money now.

10h · Views 1.2K

@_arohan_ i am writing up something on using sparse auto encoders within Muon and Adam, would be keen for you to take a look :)

10h · Views 962
immortal@immortaldip

@_arohan_
> LLM research question detected
> PEFT gets loaded
> PEFT decide to tell Joke

Muon, Shampoo, and K-FAC walk into a training run.

Adam looks over and says, “Great… the second-order optimizer support group is here to tell me I’m just momentum with marketing.”

11h · Views 704
Reyaa@snr_boost

@_arohan_ It let you do particle physics, chemistry and advanced math?

10h · Views 444
Santosh Mohan@theycallmeMohan

@_arohan_ Doesn’t it silently degrade performance in this case instead of flagging it explicitly to the user?

2h · Views 89 · Likes 1
Kaizhao Liang@KyleLiang5

@_arohan_ Try AdamW 😂 see if it asks you “who is Adam”

2h · Views 71 · Likes 1
Mayz@lunan_ai

@_arohan_ that list reads like someone just dumped their gradient descent notebook

which one should i be most worried about understanding?

9h · Views 187
ar0cket1@ar0cket1

@_arohan_ its modified without notice, so no surprise

8h · Views 147
Gauri Tripathi@Gauri_the_great

@_arohan_ something is fishy

6h · Views 131
T Ay.@ayedtay

@_arohan_

2h · Views 9
Blissy@BlissyOnX

@_arohan_ ngl that reads like a chaos scroll through optimization research

u forking the llm mafia or what

11h · Views 2
Invincible@InvincibleEdge

@_arohan_ means u let a vibe cod ER build logic brick piles until architecture calc bc like 101 way back oh what they know early era 47 is there|they learning heavy optimization via formal gradient align align

11h