
NanoGPT Training Plots Reveal Convergent Clusters In Sequence Intensity

Original post

not entirely sure it's how nanogpt speedrun should be done but educational.

9:02 AM · May 16, 2026

actually maybe it's just time to design new challenges:
* moe (!)
* intently multilingual
* documented/releasable data (and mode failure)
* at least partial synth with reasoning traces throughout (SYNTH/Zyphra)
* benchmark targets on top of loss.

Alexander Doria (@Dorialexander)

not entirely sure it's how nanogpt speedrun should be done but educational.

4:02 PM · May 16, 2026 · 5.9K Views
8:23 PM · May 16, 2026 · 3.2K Views

and maybe even an agentic subset (function calling definitely work in small range)

8:26 PM · May 16, 2026 · 505 Views