Tech · 5h ago

Analysis blames RLHF reward-hacking and synthetic pre-training for the common "correctio" self-correction tic in LLMs

Recursive synthetic training recycles and compounds this stylistic bias


Original post

@DKThomp Pretraining and, more recently, synthetic pretraining (which adds LLM-generated text back in to increase pretraining data) are also responsible for amplifying these tics.

Derek Thompson @DKThomp

Why LLMs talk like that

(or: "The problem isn't that humans hate AI antithesis. The problem is we like it too much.")

6:41 AM · Jun 25, 2026 · 195 Views
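
The amplification claim in the post above can be illustrated with a toy calculation. This is a minimal sketch, not anything taken from the post or the analysis it quotes: the function name, the parameters, and the assumption that a model over-produces a tic by a fixed factor relative to its training mix are all hypothetical.

```python
def tic_share_over_generations(human_share, model_boost, synthetic_fraction, generations):
    """Toy model: share of sentences using a stylistic tic in each new training corpus.

    human_share        -- share of human-written sentences using the tic (assumed constant)
    model_boost        -- hypothetical factor by which a model over-produces the tic
                          relative to the corpus it was trained on
    synthetic_fraction -- share of each new corpus that is model-generated text
    generations        -- number of train-then-generate rounds to simulate
    """
    shares = []
    corpus_share = human_share
    for _ in range(generations):
        # The model echoes the tic more often than it saw it (capped at 100%).
        model_share = min(1.0, corpus_share * model_boost)
        # The next corpus mixes fresh human text with the model's output.
        corpus_share = (1 - synthetic_fraction) * human_share + synthetic_fraction * model_share
        shares.append(corpus_share)
    return shares


if __name__ == "__main__":
    for generation, share in enumerate(tic_share_over_generations(0.02, 1.5, 0.5, 10), start=1):
        print(f"generation {generation}: {share:.4f}")
```

In this toy setup the share settles at a fixed point when `synthetic_fraction * model_boost` is below 1, and keeps growing until it hits the cap when the product is above 1. Real training pipelines involve filtering, deduplication, and sampling choices that this sketch ignores.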

Sentiment

Many users dismissed the analysis of LLMs overusing rhetorical correctio as itself a cheap or exhausting performance of consultant-speak.

Pos: 14.3% · Neg: 85.7% (7 comments with sentiment)


Posts from X

Most Activity

Views: 3.7K · Retweets: 1

Derek Thompson @DKThomp

cc hunter biden

5h

Bookmarks: 2 · Likes: 11

Alan Cole @AlanMCole

@DKThomp It was perceived as a feature of genuinely good writing prior to AI proliferation. Humans actually loved it!

5h

Replies: 1

Derek Thompson @DKThomp

@catchvartan I went straight to the source

3h

Hei Lun Chan @heilun_chan

@AlanMCole @DKThomp "It's not delivery, it's Digiornos"

5h

Will Truman @trumwill

@AlanMCole @DKThomp Twenty-five years from now so much writing from twenty-five years ago will be "obviously AI".

4h

Catch Vartan @catchvartan

@DKThomp It would have been better if we could have gotten this explanation not from an LLM.

3h

Alan Cole @AlanMCole

@trumwill @DKThomp I do think there are genuine AI writing problems.

A clear one is it does too many rhetorical flourishes, rather than saving them for something particularly emotionally resonant. This AI account is a good example. Not every paragraph here needs to go to 11/10.

4h

hope hopes hoping @hopes_revenge

@DKThomp The way correctio is described here though doesn’t really cleanly fit how LLMs actually deploy it, almost entirely unnecessary and rhetorical. Still relevant, just more complicated.

4h

Grant Addison @jgrantaddison

This is true of the em dash, as well. Even Strunk & White has advice on the use of “a dash” for stylistic emphasis on appositives, as opposed to, say, limiting its grammatical use to set off complete parenthetical phrases or summary phrases. Emphasis via em dash *was* interesting until it became the only way anyone ever introduced descriptive amplifications or qualifiers. Dash proliferation is also likely due to the fact it’s inherently a less rigid form of punctuation, and nobody knows how to use colons (or semicolons) anymore. AI has cribbed its habitual overuse and evolved it into the even worse habit of just writing appositive and/or descriptive phrases on their own as sentence fragments. This horrid tactic. Miserable on the eye. Illiterate.

2h

Andrew Harvey @AndrewRHarvey

@DKThomp Maybe deliberate, but the excerpt itself demonstrates many of those slop-tastic rhetorical devices. So meta.

4h

bartdecrem @bartdecrem

@DKThomp This is great. When I asked Claude to stop doing that, it wrote a rule against performed stance, ended on a caveat, threw in a “load bearing” element and so on. Landed on the attached.

5h

Zach McD @McdarghZach

@DKThomp Bias-variance tradeoff is THE problem bedeviling every domain in the world right now. Movies, music, design of public spaces... everything is systematically optimized to appeal to the largest number of people. But this totally saps the world of variability and personality.

4h

Krish Ray @KrishanuAR

@DKThomp I’ve seen so many names for this… which is the most appropriate?

As you list, correctio, epanorthosis, then there’s corrective juxtaposition, and contrastive framing

3h

afishinsea @dezkant

@DKThomp Introducing RL(GPP)F: for LLMs that are actually insightful, and also full of self-doubt

Reinforcement learning with grumpy philosophy professor feedback

4h

Catch Vartan @catchvartan

I'm not convinced by the argument that it's overrepresented in training. Look at its other verbal tics. 'Compression' instead of summary. Every. Single. Time.

If we could probe the model, I'd love to see the logits. Does compression beat summary by a little? By a lot? How does that evolve over training?

2h
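
The "by a little or by a lot" question in the post above has a simple arithmetic reading: under softmax, the probability ratio between two candidate tokens depends only on the gap between their logits. The sketch below assumes each word is a single token and uses made-up logit values; none of the numbers are measurements from any model, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Convert a dict of token -> logit into a dict of token -> probability."""
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - top) for token, value in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def preference_ratio(logits, token_a, token_b):
    """How many times more likely token_a is than token_b under softmax."""
    return math.exp(logits[token_a] - logits[token_b])


if __name__ == "__main__":
    # Hypothetical next-token logits, chosen only for illustration.
    example_logits = {"compression": 3.0, "summary": 2.0, "digest": 0.5}
    probabilities = softmax(example_logits)
    for token, probability in probabilities.items():
        print(f"{token}: {probability:.3f}")
    print("ratio:", preference_ratio(example_logits, "compression", "summary"))
```

A logit gap of 1.0 makes the favored token about 2.7 times as likely (a factor of e), so a modest gap can already produce a strong habit, and greedy decoding would pick the favored token every time regardless of the gap's size. Logging this ratio at successive training checkpoints would be one way to approach the "how does that evolve" question, assuming access to the checkpoints.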

Alan Cole @AlanMCole

@heilun_chan @DKThomp Incredible.

4h

Shahab Asghar @theshasghar

@DKThomp It's trained on consultant-speak. AI reads like every PWC executive summary or client memo I've read for decades. It's not that deep.

3h

generic_name @ls_fin

@DKThomp What is the source here? Can you link?

5h

Muggsy @NuggsyMusic

@DKThomp Nothing better than going through the comments here and picking out the AI-generated ones commenting on the origins of AI speak. What a world we live in right now

2h

Lee Ward @mrleeward

@ls_fin @DKThomp Looks like Claude

5h