Tech · 4h ago

Neural MMO creator Joseph Suarez parodies recent ML optimizer research, joking that a new "PowerWash" method achieves zero-step training

The fictional algorithm evaluates random weights until validation passes.
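
To make the joke concrete, here is a toy sketch of what such a "zero-step" search could look like. PowerWash is fictional, so nothing below is the author's code: the linear model, the accuracy metric, the toy validation set, and every function name are assumptions chosen for illustration. The sketch takes no gradient steps. It simply moves all of the cost into repeated evaluation.

```python
import random


def make_weights(seed, dim):
    """Draw a weight vector from a seeded generator (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def accuracy(weights, data):
    """Fraction of examples a linear classifier with these weights gets right."""
    correct = 0
    for features, label in data:
        score = sum(w * x for w, x in zip(weights, features))
        correct += int((score > 0) == label)
    return correct / len(data)


def powerwash(data, dim, threshold, max_seeds=10_000):
    """Try seeds 0, 1, 2, ... and return the first (seed, weights, accuracy)
    whose accuracy meets the threshold, or None if no seed passes."""
    for seed in range(max_seeds):
        weights = make_weights(seed, dim)
        acc = accuracy(weights, data)
        if acc >= threshold:
            return seed, weights, acc
    return None


# Toy validation set (an assumption): the label is True when x0 + x1 > 0.
data_rng = random.Random(1234)
validation = []
for _ in range(200):
    x = (data_rng.uniform(-1, 1), data_rng.uniform(-1, 1))
    validation.append((x, x[0] + x[1] > 0))

result = powerwash(validation, dim=2, threshold=0.9)
if result is not None:
    seed, weights, acc = result
    print(f"seed {seed} passed with accuracy {acc:.3f} after 0 training steps")
```

On a two-parameter toy problem a lucky seed turns up quickly. The chance of a random draw passing shrinks rapidly as the parameter count grows, which is the point of the parody.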

Original post
Joseph Suarez 🐡 @jsuarez · #1371 in Tech

Hello @kellerjordan0 @_arohan_. I noticed that in your recent optimizer work, you appear to have used the inefficient versions of Muon and Shampoo that have long since been succeeded by PowerWash last week. The new algorithm is quite simple and elegant: it merely generates a set of weights with a different seed and evaluates until one of them passes the validation threshold, therefore cutting speedrun time down to 0 steps. The SplittingHairs normalization addition is particularly useful for stabilizing performance. I hope we can collaborate to bring this new standard into broader usage!

10:51 AM · Jun 10, 2026 · 14.2K Views
Sentiment

Commenters play along with the PowerWash zero-step satire, thanking the author for the clear explanation, and one praises the fused weight update and fp32 upcasts in the code.

Pos: 100.0%
Neg: 0.0%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 3.8K · Likes 42 · Replies 2
rohan anil @_arohan_

I love power wash, cosigned

2h · Views 3.8K · Likes 42 · Bookmarks 2

Now that I have your attention, any suggestions on our ~200 line CUDA implementation of Muon would be greatly appreciated https://github.com/PufferAI/PufferLib/blob/4.0/src/muon.cu. In the 5.0 branch on the same file, I played with a small change to preserve LR across model sizes, but there have not been any major improvements otherwise.

4h · Views 1.9K · Likes 32 · Bookmarks 3
Lucas Nestler @Clashluke

@jsuarez i love the fused weight update and fp32 upcasts. excellent code, sir

2h · Views 154 · Likes 1 · Bookmarks 1
Michal Wolski @michalwols

@jsuarez @kellerjordan0 @_arohan_ real alpha is in taking all of those hf checkpoints and learning to initialize, maybe also a better optimization basis

3h · Views 674 · Likes 2

@jsuarez @kellerjordan0 @_arohan_ thank you for explaining this so clearly. I feel like explanations like this are remarkably rare, and I look at a lot of explanations.

3h · Views 579 · Likes 2
高煜朗 @tRpyNXsk3bDB6A7

@jsuarez @kellerjordan0 @_arohan_ serious?

3h · Views 474 · Likes 1
Ted @TedNoNumbers

@jsuarez @kellerjordan0 @_arohan_ what is this PowerWash? can you link it?

3h · Views 488
radah @the_radah

@jsuarez @kellerjordan0 @_arohan_ Bruh

2h · Views 240
Lunari @0x_lun

@_arohan_ power washing is one of those things that scratches a brain itch nobody knew they had until they tried it

2h · Views 113
Ω.KendrickPlumard @fouriergalois

@michalwols @jsuarez @kellerjordan0 @_arohan_ 🤔

2h · Views 12
QL @KP199943

@jsuarez @kellerjordan0 @_arohan_ lol

2h · Views 7