2d ago

Engineer suspects 2015 Marathi blog is behind nanoGPT speedrun spikes

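Alexander Doria, who tracks nanoGPT speedrun training runs, suspects that a Marathi-language blog post dated 30 August 2015 is behind many of the runs' spikes and anomalies, and possibly some minute optimizations. The post mixes English national-security messaging with dense Devanagari text and appears to have slipped past the English-language filter, which would have let it reach the training data unnoticed. In a follow-up, Doria reported a less striking effect elsewhere: most other spikes seem attributable to very long documents with a narrow topic focus, including a meeting transcript, an academic blog and a MacGyver fanfic.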
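The posts do not say how the English filter works or why this document passed it. As a purely illustrative sketch, assume a hypothetical filter that only inspects the opening characters of a document: a post that opens in English and then switches to Devanagari would look clean at the head and mostly non-English overall. The function name, the sample text and the 40-character prefix are assumptions made for this example, not details from the thread.

```python
def devanagari_share(text: str) -> float:
    """Share of script characters in the Devanagari block (U+0900-U+097F).

    Only ASCII letters and Devanagari code points are counted, so digits,
    punctuation and whitespace do not dilute the ratio.
    """
    devanagari = 0
    latin = 0
    for ch in text:
        if "\u0900" <= ch <= "\u097F":
            devanagari += 1
        elif ch.isascii() and ch.isalpha():
            latin += 1
    total = devanagari + latin
    return devanagari / total if total else 0.0


# Hypothetical document: an English opening followed by long Marathi text.
doc = "National security is everyone's responsibility. " + "मराठी भाषेतील हा एक लांब लेख आहे. " * 20

head_share = devanagari_share(doc[:40])  # what a prefix-only check would see
full_share = devanagari_share(doc)       # what the whole document contains

print(f"prefix: {head_share:.2f}, full document: {full_share:.2f}")
```

Scoring the full document, or several samples spread across it, is one way a check like this could avoid being misled by an English opening, at the cost of more compute per document.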

Original post

Alexander Doria#867@DORIALEXANDER

ok i'm starting to suspect many nanogpt speedrun spikes/anomalies (and maybe even minute optimization) can be tracked to this one marathi blog that somehow evade the English filter.

9:34 AM · May 15, 2026

Cluster engagement

28 snapshots

Reposted by

#713@PMINERVINI

ORIGINAL POST

#867Alexander Doria@DORIALEXANDER

ok i'm starting to suspect many nanogpt speedrun spikes/anomalies (and maybe even minute optimization) can be tracked to this one marathi blog that somehow evade the English filter.

4:34 PM · May 15, 2026 · 27K Views

QUOTE POST

#867Alexander Doria@DORIALEXANDER

Effect is not as striking but seems like I can attribute most spikes to very long docs with excessive topic focus: transcript meeting, academic blog and… a macgyver fanfic (?)

5:09 PM · May 15, 2026 · 354 Views

QUOTE POST

#1402bilal@BILALTWOVEC

loooool

5:26 PM · May 15, 2026 · 783 Views