Researchers Introduce Muown Row-Norm Technique For Modded-NanoGPT Training

Keller Jordan (@kellerjordan0)

Here's a thread of six academic (non-world-record) Modded-NanoGPT optimization results from the past few weeks.

Result #23: Kai Lion and Florian Hübler have contributed a 3075-step run using a row-norm control technique called Muown. This result is notable for its simplicity.

6:36 PM · May 31, 2026 · 7.1K Views
Posts from X
Views: 1K · Bookmarks: 3 · Likes: 23 · Retweets: 2 · Replies: 1
Keller Jordan (@kellerjordan0)

Result #24: @zhiweixux showed that the 2026/05/01 record could be improved by 50 steps by separately tuning the WSD cooldown fractions for the hidden matrix vs. other parameters. Note: This can't be applied to the current record because it uses a power lr schedule.
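For readers unfamiliar with the idea in Result #24, here is a minimal sketch of a warmup-stable-decay (WSD) schedule in which each parameter group gets its own cooldown fraction. It is an illustration only, not the code from the run: the linear decay shape, the group names, the base learning rates, and the fraction values are all assumptions chosen for the example.

```python
# Hypothetical per-group settings; the actual tuned values are not given in the post.
BASE_LRS = {"hidden_matrix": 0.05, "other": 0.008}
COOLDOWN_FRACS = {"hidden_matrix": 0.5, "other": 0.25}


def wsd_multiplier(step, total_steps, cooldown_frac, warmup_steps=0):
    """Return a learning-rate multiplier in [0, 1] for a WSD schedule.

    Assumes linear warmup, a constant plateau, then a linear decay to zero
    over the final `cooldown_frac` of training.
    """
    if not 0.0 <= cooldown_frac <= 1.0:
        raise ValueError("cooldown_frac must be in [0, 1]")
    if warmup_steps > 0 and step < warmup_steps:
        return (step + 1) / warmup_steps
    cooldown_start = total_steps * (1.0 - cooldown_frac)
    if cooldown_frac == 0.0 or step < cooldown_start:
        return 1.0  # stable phase
    # Decay phase: fraction of the cooldown window that is still remaining.
    remaining = total_steps - step
    return max(0.0, remaining / (total_steps - cooldown_start))


def group_lrs(step, total_steps):
    """Learning rate for each parameter group at `step`."""
    return {
        name: BASE_LRS[name] * wsd_multiplier(step, total_steps, COOLDOWN_FRACS[name])
        for name in BASE_LRS
    }


if __name__ == "__main__":
    for step in (0, 600, 800, 1000):
        print(step, group_lrs(step, total_steps=1000))
```

With these example values, the hidden-matrix group starts decaying at the halfway point while the other parameters stay at full learning rate until the last quarter of training. In a real training loop the returned values would be written into the matching optimizer parameter groups each step.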
