Tech · 2d ago

University of Maryland's Yekyung Kim finds LLMs suffer from "argument collapse," generating unique arguments just 3.4% of the time

Story Overview

New analysis from University of Maryland researchers shows that large language models from multiple providers converge on nearly identical main arguments in long-form debate essays. LLM essays made a unique main argument only 3.4 percent of the time, compared with 65.3 percent for human writers responding to the same New York Times prompts.
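The summary does not say how uniqueness was measured, so the following is only an illustrative sketch of what a "unique main argument" rate could look like. It assumes each essay has already been assigned a main-argument label per prompt. The function name `unique_argument_rate`, the label strings, and the toy data are all hypothetical and are not taken from the study.

```python
from collections import Counter


def unique_argument_rate(arguments_by_prompt):
    """Fraction of essays whose main-argument label is used by no other
    essay written for the same prompt (an assumed definition)."""
    unique = 0
    total = 0
    for labels in arguments_by_prompt.values():
        counts = Counter(labels)
        # An essay counts as unique if its label appears exactly once.
        unique += sum(1 for label in labels if counts[label] == 1)
        total += len(labels)
    return unique / total if total else 0.0


# Made-up labels for two prompts, four essays each (not real data).
llm_essays = {
    "prompt_a": ["A1", "A1", "A1", "A1"],
    "prompt_b": ["B1", "B1", "B2", "B1"],
}
human_essays = {
    "prompt_a": ["A1", "A2", "A3", "A1"],
    "prompt_b": ["B1", "B2", "B3", "B4"],
}

llm_rate = unique_argument_rate(llm_essays)
human_rate = unique_argument_rate(human_essays)
print(f"LLM: {llm_rate:.1%}, human: {human_rate:.1%}")
```

Under this assumed definition, heavy overlap in labels within a prompt pushes the rate toward zero, which is the pattern the reported 3.4 percent figure describes for LLM essays.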

Yekyung Kim@YekyungKim

From op-eds in newspapers to NeurIPS position papers, AI is increasingly shaping long-form public discourse. Its arguments seem plausible, but beneath surface fluency, we find argument collapse: different LLMs converge to the same main & supporting arguments and structure.

8:01 AM · Jun 8, 2026 · 115.3K Views
Open Question

Diversity instructions only go so far

Even when researchers added explicit variety prompts or position guidance, models recovered just half of the distinct human arguments and sometimes produced outputs outside the observed human range.

Developer Impact

Public discourse could narrow if models dominate drafting

The study notes that repeated reliance on the same polished argument structures and hedged sub-points might shrink the variety of ideas reaching readers, though real-world editing and retrieval use remain untested.

Sentiment

Positive users praise the LLM argument collapse study for exposing how AI flattens discourse depth, while negative users accuse researchers of fabricating results to protect tuition revenue.

Positive: 66.7% · Negative: 33.3% (6 comments with sentiment)
Cluster Engagement
Posts from X
Views 43.9K · Bookmarks 120 · Retweets 42
Mohit Iyyer@MohitIyyer

Different LLMs, when asked to write an essay on the same debate prompt, converge on the same main argument far more often than humans do, a phenomenon we call "argument collapse". On ~200 debate prompts, LLM essays make a unique main argument just 3% of the time, compared to 65% for human authors.

While each LLM essay might be totally reasonable on its own, as more and more of them spread through public discourse, they flatten the range of arguments that we read. Read more 👇

2d · Views 43.9K · Likes 221 · Bookmarks 120
Likes 230 · Replies 33
Ethan Mollick@emollick

The Matrix idea of keeping humans as batteries is obviously weird... we would be more useful as dice.

LLMs default to very similar kinds of arguments & structure, and even different LLMs seem to collapse to similar concepts. Humans provide a lot more variation in their own work.

1d · Views 32.6K · Likes 230 · Bookmarks 82

Fortunately, AI is just Noah Smith, because it is trained on Noah Smith.

Thus, Noah Smith Thought will now conquer the world without me having to do anything 🥰

1d · Views 16.5K · Likes 41 · Bookmarks 11

Every LLM is just Noah Smith in computer form

1d · Views 17.6K · Likes 45 · Bookmarks 4

This is becoming disturbingly evident across formerly respectable publications across South Asia

1d · Views 2.1K · Likes 12 · Bookmarks 11
Yekyung Kim@YekyungKim

From op-eds in newspapers to NeurIPS position papers, AI is increasingly shaping long-form public discourse. Its arguments seem plausible, but beneath surface fluency, we find argument collapse: different LLMs converge to the same main & supporting arguments and structure.

2d · Views 115.3K · Likes 295 · Bookmarks 217
Jenna Russell@jennajrussell

Using AI for persuasion writing seems to flatten arguments. What is an op-ed if not to take a unique and personal stance? Great work from my labmates @YekyungKim and @YapeiChang!!

2d · Views 2.7K · Likes 30 · Bookmarks 3

@YekyungKim interesting! we have fairly consistent findings in our preprint here and we find that post-training is one reason: https://arxiv.org/pdf/2605.27878 CC @profjamesevans

2d · Views 445 · Likes 9 · Bookmarks 3
Sreeram Kannan@sreeramkannan

I don’t think it’s intrinsic to LLMs.

LLMs can model many personalities in the ensemble. But without adequate directional context they will resort to convergent thinking (modeling the average user’s next token or feedback).

Right now humans can supply the directional context, but when set up with their own evolutionary game (agents have their own money / property), the ones with non-convergent thinking or prompts or soul-md will have better survival.

1d · Views 866 · Bookmarks 3

Excellent work !!!

2d · Views 941 · Likes 7 · Bookmarks 1

No.

I disagree.

And why don’t you say the real reason you are making this crap up… Mr. Assistant professor.

Tuition money dropping, so lie about AI.

Shout “argument collapse” louder so when everyone finds out you’re full of it, it’s loud.

I am so tired of universities openly lying and misrepresenting information to gain public opinion.

2d · Views 181 · Likes 10
Mohit Iyyer@MohitIyyer

@Patty_H93 We didn't evaluate the merits of each argument, but we did extract and analyze high-level characteristics of LLM arguments vs. human arguments. See the quoted tweet:

2d · Views 484 · Likes 2 · Bookmarks 1
Brian Cheong@briancheong

@MohitIyyer Argument collapse feels like the writing equivalent of mode collapse. The weird part is that it can look diverse at the sentence level while converging on the same thesis.

2d · Views 144 · Likes 2 · Bookmarks 1
Patty@Patty_H93

@MohitIyyer Are all arguments rated the same? Is it possible the LLM’s argument is a better one?

2d · Views 410
Suresh@_Suresh2

@MohitIyyer bet the 3% changes with different system prompts

2d · Views 258
Mohit Iyyer@MohitIyyer

@_Suresh2 We try different prompts in the paper! You can indeed improve this number if you ask the model to generate N different essays for a given prompt, each with different main arguments. However, many of the LLM arguments in this setting are not ones that humans would make.

2d · Views 237 · Likes 4

@YekyungKim @AnnaRMills This is going to be what I start pointing to when people say they "just use it for idea generation." To me this is exactly backwards. Your ideas are what make you human. Then, if anything, just use the AI to fill in the grammar and check for flow.

We're using these tools all wrong

1d · Views 22 · Likes 2
Nick Dobos@NickADobos

@emollick Oh no

The AIs will gamble on us for sport

You don’t need dice when you have human gladiators

1d · Views 469 · Likes 3 · Bookmarks 0
stringking42069@stringking42069

@MohitIyyer That’s a great point and you’re right to push back on this flattening of discourse generated by LLMs. It’s not just the creeping sense of bland, overly smoothed discussion points but the feeling that this also leads to a subtle creeping feeling of cognitive offloading.

1d · Views 128 · Likes 4
Yekyung Kim@YekyungKim

Even when the central claim is similar, humans support it in more varied ways. Among essays with the same main arguments, 41.0% of supporting arguments extracted from human essays are unique. For LLMs, only 9.1% are.

2d · Views 11 · Likes 2