NanoGPT Training Plots Reveal Convergent Clusters In Sequence Intensity

Original post

Alexander Doria #897 @Dorialexander

not entirely sure it's how nanogpt speedrun should be done but educational.

9:02 AM · May 16, 2026

QUOTE POST

#897 Alexander Doria @Dorialexander

actually maybe it's just time to design new challenges:
* moe (!)
* intently multilingual
* documented/releasable data (and mode failure)
* at least partial synth with reasoning traces throughout (SYNTH/Zyphra)
* benchmark targets on top of loss.

Alexander Doria @Dorialexander

not entirely sure it's how nanogpt speedrun should be done but educational.

4:02 PM · May 16, 2026 · 5.9K Views

8:23 PM · May 16, 2026 · 3.2K Views

#897 Alexander Doria @Dorialexander

and maybe even an agentic subset (function calling definitely works in small range)

Alexander Doria @Dorialexander

8:23 PM · May 16, 2026 · 3.2K Views

8:26 PM · May 16, 2026 · 505 Views