Among the other things AIs will get better at is recognizing AI-generated text. Presumably writing generated by this year's models will be easy to catch in a few years. Prediction: This will cause huge scandals in academia, as published papers are later exposed as AI-generated.
Lancet audit of 2.5 million papers finds fabricated citations rose from 4 to 57 per 10,000 after mid-2024
Story Overview
An automated sweep of 2.47 million open-access biomedical papers found that the rate of fabricated references stayed flat near four per 10,000 through 2023, then jumped more than twelvefold after mid-2024, reaching nearly 57 per 10,000 in early 2026.
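As a rough check on those figures, the two rounded rates can be compared directly. This is only a sketch using the numbers quoted in the summary; the audit's exact values and its denominator (per 10,000 papers or per 10,000 references) are not stated here, so the variable names and the result are illustrative.

```python
# Rounded rates quoted in the summary, in fabricated references per 10,000.
# The exact audit values are not given, so treat the result as approximate.
baseline_rate = 4   # "flat near four per 10,000 through 2023"
latest_rate = 57    # "nearly 57 per 10,000 in early 2026"

fold_change = latest_rate / baseline_rate
absolute_rise = latest_rate - baseline_rate

print(f"fold change: {fold_change:.2f}x")
print(f"absolute rise: {absolute_rise} per 10,000")
```

With these rounded inputs the ratio comes out a little above fourteen, which is consistent with the summary's "more than twelvefold".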
Timing lines up with LLM rollout yet other causes remain possible
The inflection matches widespread availability of large language models, but the audit authors flag paper-mill activity and indexing changes as alternative explanations and stop short of assigning blame.
Permanent papers now sit exposed to future detection
Paul Graham notes that once better AI detectors arrive, older work generated by today's models could trigger integrity scandals long after publication.
Supportive users welcome advances in AI detection for catching fake academic papers and cleaning up plagiarism, while critical users distrust the motives behind it, seeing them as part of corrupt publishing and education systems.
Academia is uniquely vulnerable to being caught out in this way, because the test of productivity is publication. So all the evidence is out there, permanently, to be analyzed by ever more sophisticated tools.

@paulg What’s the problem with AI papers? Papers in general are reports. Why can't AI prepare a report for you? Maybe Google spreadsheets can't be used either?

@paulg has anyone explained why they're so bad at writing still?

@mathelirium @paulg I think we reached the phase where humans are mimicking AI writing, so who will detect who?🫣🤔

@paulg Arguably the writing itself is the least interesting part of an academic publication. Agree on the rest.
@paulg I think I'd bet against 'huge' scandals, depending on exactly what that means. I think there'll be scandals, but too many for any of them to be that huge. It's ambiguous, people will make excuses. The world will lower its opinion of academia but I don't expect many heads to roll

@paulg Do you have any technical insight into why that could be the case, or is it a prediction based on how models progress?
Asking because I don’t think that is possible.

@paulg You can already do this via tools like @GPTZeroAI
@paulg Have you seen this? It isn't just the content but the references used to support the content. https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(26)00603-3/fulltext

@paulg Not sure how valuable it is to recognize AI text. I can write a text that sounds very AI-like and get a false positive. Is this gonna lead to people intentionally changing their writing styles to be acutely non-AI sounding? And what happens when new models pick up that style too?

@paulg Writing and reading AI-generated text is like eating the menu instead of the meal. Painful for both ends. I believe in pain.

@alexpotato @brunodagnino @paulg Yeah I wouldn't be surprised if the inevitable scandal's resolution ended up being, "yeah we used AI to help write the paper so we could better communicate the findings of our actual work, our real job, and there's nothing wrong with that".
Except in postmodern literary criticism lol

@xeno_form Most writing. And, interestingly, most art too.

@brunodagnino If you mean the writing as distinct from the ideas, there is no such thing. And indeed this will be the most scandalous aspect of the scandal: the extent to which the ideas in papers are due to AIs rather than the authors claiming credit for them.

@paulg If an AI can detect that the text is AI generated, that can be used as an oracle to reinforce a new model being trained to produce output that escapes detection. It's an arms race.
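
A toy sketch of the loop described above, for illustration only. Everything in it is hypothetical: the list of "tells", the `detector_score` function, and the `evade` function are made up, and a greedy rewrite stands in for actual model training against a detector.

```python
# Hypothetical "tells" a naive detector might key on, with plainer substitutes.
TELLS = {
    "delve into": "look at",
    "moreover": "also",
    "\u2014": ", ",  # em dash
}

def detector_score(text: str) -> int:
    """Toy detector: count known tells (higher means more 'AI-like')."""
    lowered = text.lower()
    return sum(lowered.count(tell) for tell in TELLS)

def evade(text: str) -> str:
    """Toy stand-in for training: keep any rewrite the detector scores lower.

    Replacement is case-sensitive, so this only handles lowercase tells.
    """
    current = text
    for tell, substitute in TELLS.items():
        candidate = current.replace(tell, substitute)
        # The detector is used purely as an oracle: we only ask for its score.
        if detector_score(candidate) < detector_score(current):
            current = candidate
    return current

draft = "moreover, we delve into the results\u2014which are robust."
revised = evade(draft)
print(detector_score(draft), "->", detector_score(revised))
print(revised)
```

The rewritten text drops every feature this toy detector looks for, so its score falls to zero. A real detector and a real generator would be far more complex, but the feedback structure is the same: whatever the detector exposes, the other side can optimize against.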

One annoying part is that these models are basically being taught to think clearly by copying the writing of analytic philosophers, who think exceptionally clearly. Then, everyone is going to look back and think analytic philosophy papers were all written by AI, even though it was the other way around. This will happen partly because no one with influence knows what analytic philosophy even is, so they'll lose the PR battle. Similar to how the em dash was hijacked by LLMs, which philosophers also always used much more than usual.

@paulg I'm not so sure. Right now overfitting seems like a big issue. It's easy for a detector to latch on to something that stands out in the general case, but not always, like fancy writing. Or some idiosyncrasies, like em dashes, which real people also produce.

@maintcraft That statement will be true iff it's a tautology.

@paulg have you seen Extropic?
you can call total BS on their whitepaper TODAY

What actually bothers people about AI-generated content?
Is it the fact that a model produced it, or is it the kind of writing that "looks complete, but you can tell nobody was actually thinking"?
If the output is genuinely valuable, and the person is willing to stand behind the ideas and take responsibility for them, does it still matter how much AI helped write it?