19h ago

Substack analysis maps 1236 n-grams in Granta prize-winning story and traces 437 rare phrases plus 115 unique segments to Archive of Our Own fanfiction

Interactive tool shows story assembled from memorized online passages.

Original post

LLMs are not conscious. They do not have a perfect sense of embodiment. They are autoregressive models that generate text by sampling, more or less, from a very large pile of things other people wrote. More details in this essay on Substack 👇 https://tuhinchakrabarty.substack.com/p/ai-slop-grantagate-and-bad-writing

7:22 AM · May 22, 2026

Paging Alan Sokal.

Alex ImasAlex Imas@alexolegimas

This from @TuhinChakr is brilliant. That prize winning story from Granta? Turns out it's just a bunch of random whole phrases taken directly from existing text on the internet. Tool allows you to trace those n-grams directly to their source, which is mostly random fanfiction. https://tuhinchakrabarty.substack.com/p/ai-slop-grantagate-and-bad-writing

8:09 PM · May 22, 2026 · 170.3K Views
12:59 AM · May 23, 2026 · 33K Views

Since this post has blown up:

1) The research is based on two papers:

https://arxiv.org/pdf/2410.04265 https://arxiv.org/pdf/2504.07096

2) When writing about the matches, I focused on webpages that are not defunct. Fan fiction results were especially relevant to AI fiction, but some phrases can appear on other websites too. That does not change the point about genre mismatch or the stitching together of rare expressions.

3) The attribution engine is built on CommonCrawl, which LLMs have been trained on, so it might not catch every webpage that contains a given expression.
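
For readers unfamiliar with n-gram tracing, here is a toy sketch of the general idea: index every n-gram in a reference corpus, then look up each n-gram of a story to see which documents contain it. This is not the attribution engine from the papers above, which works over CommonCrawl at a far larger scale. The function names (`ngrams`, `build_index`, `trace`), the whitespace tokenization, and the tiny sample corpus in the usage note are all assumptions made for illustration.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(corpus, n):
    """Map each n-gram to the set of document ids that contain it.

    corpus is a dict of {doc_id: text}. Tokenization here is a naive
    lowercase whitespace split (an assumption, not the papers' method).
    """
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for gram in ngrams(text.lower().split(), n):
            index[gram].add(doc_id)
    return index


def trace(story, index, n):
    """Return {phrase: [doc_ids]} for each story n-gram found in the index."""
    matches = {}
    for gram in ngrams(story.lower().split(), n):
        sources = index.get(gram)  # .get avoids creating empty entries
        if sources:
            matches[" ".join(gram)] = sorted(sources)
    return matches
```

Calling `trace(story, build_index(corpus, 4), 4)` on a hypothetical corpus such as `{"fic_a": "...", "blog_b": "..."}` returns the 4-word phrases in `story` that also occur verbatim in a corpus document, keyed to the ids of those documents. As point 3 notes, a phrase missing from the index only means it was not found in that particular corpus.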

12:02 AM · May 23, 2026 · 2.6K Views

@alexolegimas Thank you 🥹

8:15 PM · May 22, 2026 · 5.8K Views
