
Engineers trace nanoGPT speedrun spikes to 2015 Marathi blog


Research engineers tracking nanoGPT training runs identified a Marathi-language blog post dated 30 August 2015 as the source of repeated speedrun loss spikes and anomalies (and possibly minor optimizations). The post mixes English national-security messaging with dense Devanagari text and has evaded standard English-language filters, circulating undetected in AI training datasets. Twitter users traced the anomalies directly to this single source.

Original post

Alexander Doria @Dorialexander

ok i'm starting to suspect many nanogpt speedrun spikes/anomalies (and maybe even minute optimization) can be tracked to this one marathi blog that somehow evade the English filter.

9:34 AM · May 15, 2026

Effect is not as striking, but it seems like I can attribute most spikes to very long docs with excessive topic focus: a meeting transcript, an academic blog and… a MacGyver fanfic (?)

5:09 PM · May 15, 2026 · 354 Views
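The attribution described in the reply — matching loss spikes to the documents in the batches where they occur — can be sketched roughly as below. This is a hypothetical illustration, not tooling that nanoGPT ships; the document IDs, the spike threshold, and the step-to-document mapping are all invented for the example.

```python
# Hypothetical sketch of spike-to-document attribution.
# Assumes you logged (a) per-step training loss and (b) which source
# documents appeared in each step's batch. All names are illustrative.
from collections import Counter

def find_spikes(losses, threshold=0.5):
    """Return step indices where loss jumps by more than `threshold`
    over the previous step."""
    return [i for i in range(1, len(losses))
            if losses[i] - losses[i - 1] > threshold]

def attribute_spikes(losses, step_to_docs, threshold=0.5):
    """Count which document sources appear in spike-step batches."""
    counts = Counter()
    for step in find_spikes(losses, threshold):
        counts.update(step_to_docs.get(step, []))
    return counts

# Toy run: a spike at step 3 whose batch contained two long documents.
losses = [4.0, 3.5, 3.2, 4.6, 3.1]
step_to_docs = {3: ["marathi-blog-2015", "meeting-transcript"],
                4: ["academic-blog"]}
print(attribute_spikes(losses, step_to_docs))
```

In practice one would aggregate over many spikes: documents that recur across spike batches far more often than chance are the candidate culprits, which is the pattern the thread describes (very long, topically narrow documents dominating a batch).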

loooool

5:26 PM · May 15, 2026 · 783 Views