Researchers debate enforcement policies for arXiv LLM submissions
Researchers discuss enforcement policies for large language model use in arXiv submissions. They identify clear markers of AI-generated content, including hallucinated references and embedded meta-comments. Current models can already detect these patterns, shifting the focus to requirements for verifiable original reasoning. Discussions cover stable rules, penalties such as one-year submission bans, the handling of model-generated text in LaTeX files, and the workload for arXiv maintainers.
Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM ("here is a 200 word summary; would you like me to make any changes?"; "the data in this table is illustrative, fill it in with the real numbers from your experiments") end/
The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue. 4/
@roydanroy @SovereignJap Is it? The slop detectors are very noisy. Would you want your paper falsely labeled as "high likelihood of slop"? Lawsuits would swiftly follow, I can assure you
@DimitrisPapail @tdietterich I think we mostly agree (and personally, it's not clear to me that banning almost-fully-AI-generated submissions is even good policy). But practically speaking, I think these levers will be used sparingly and likely only for extremely egregious slop posters.
I love arxiv, and it's been an incredible resource for science. The LLM slop fight is unwinnable though; it will put an incredible additional burden on the maintainers, create many slippery slopes, and frustrate authors. Also, perhaps the oddity in all this: if hallucinated refs are the issue, one could in fact check the validity of references... with claude code or codex :)
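A concrete illustration of that last point: a minimal sketch of an automated reference check against the public Crossref API. The matching heuristic, sample titles, and threshold-free design below are illustrative assumptions, not arXiv tooling.

```python
# Sketch: flag citations whose titles don't match anything Crossref indexes.
# Assumes titles have already been parsed out of the bibliography upstream.
import requests

def reference_exists(title: str) -> bool:
    """Ask Crossref whether any indexed work closely matches this title."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    # Crude containment check; a real tool would use fuzzy matching
    # and also consult arXiv, DBLP, Semantic Scholar, etc.
    top_title = " ".join(items[0].get("title", [""])).lower()
    return title.lower() in top_title or top_title in title.lower()

for t in ["Attention Is All You Need",
          "A Totally Imaginary Survey of Nonexistent Methods (2024)"]:
    print(t, "->", "found" if reference_exists(t) else "SUSPECT")
```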
@DimitrisPapail I see your points, but I think you may also be discounting just how curated Arxiv already is. @tdietterich and others reject a ton of low-quality submissions. There are problems with the LLM proposal, but the mods want to maintain something similar to the current quality bar.
The median social system breaks under too much optimization pressure, and we should stop trying to optimize things.
I think this is yet another example of problems surfaced by LLMs actually reflecting deep flaws in our institutions—in this case, that many of our ways of evaluating work are through imperfect, goodharted proxies rather than engaging with the work itself.
I believe many things get worse when you try to optimize them because the underlying assumptions aren’t robust to the amount of computational power we are able to leverage.
Is this referring to the rendered PDF or the LaTeX source? I certainly have papers where we didn't strip all the human feedback, so it's in the LaTeX source, which is perhaps not ideal but certainly doesn't feel particularly harmful or deserving of this penalty. Not sure why LLM comments warrant such drastically different treatment?
Totally agree, we need much more thorough checks on papers.
Re: arxiv LLM policies, it is now trivial to catch hallucinated citations, obvious LLM “if you’d like I can etc.” text, and so on, *by using current-gen LLMs*. What we really want is for output to be proof-of-thought, for which the mere existence of a paper no longer suffices.
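On the "trivial to catch" point, the cheapest screen doesn't even need an LLM: a minimal sketch that greps LaTeX source for telltale meta-comment phrases. The phrase list is an illustrative assumption, not any official arXiv filter.

```python
# Sketch: scan a .tex file for stock LLM meta-comment phrases.
import re
import sys

TELLTALE_PHRASES = [
    r"would you like me to",
    r"here is a \d+[- ]word summary",
    r"as a large language model",
    r"fill (?:it|this) in with the real numbers",
    r"if you'd like,? i can",
]
PATTERN = re.compile("|".join(TELLTALE_PHRASES), re.IGNORECASE)

def scan(path: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs matching a telltale phrase."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as f:
        for n, line in enumerate(f, start=1):
            if PATTERN.search(line):
                hits.append((n, line.rstrip()))
    return hits

if __name__ == "__main__":
    for n, line in scan(sys.argv[1]):  # usage: python scan.py paper.tex
        print(f"{n}: {line}")
```

Of course, as noted elsewhere in this thread, any fixed phrase list is exactly the kind of "look at the hands" heuristic that stops working once models are tuned to avoid it.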
@roydanroy Is a paper with no references but a "sample text that may or may not come from an LLM of papers that may or may not exist" an issue? Is a paper with no references an issue?
This only works if there is an agreed upon list of "papers that exist" each with a unique reference number
@roydanroy Also, what is the definition of a hallucinated reference? Is the LaTeX compiler accidentally putting two NeurIPS editors as authors a hallucination? Is adding a reference in v2 following a reviewer request a hallucination if it doesn't exist?
@roydanroy And an agreed upon definition of what counts as a reference.
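One operational reading of this sub-thread's question, as a minimal sketch: count a reference as existing only if it carries a DOI that resolves. This is an illustrative assumption; many legitimate references (older arXiv preprints, workshop papers, books) have no DOI and would need other handling.

```python
# Sketch: "a reference exists" operationalized as "its DOI resolves".
import requests

def doi_resolves(doi: str) -> bool:
    """True if doi.org redirects this DOI to a live landing page.
    Some publishers reject HEAD requests, so a production checker
    would fall back to GET on non-200 responses."""
    resp = requests.head(
        f"https://doi.org/{doi}", allow_redirects=True, timeout=10
    )
    return resp.status_code == 200

print(doi_resolves("10.48550/arXiv.1706.03762"))    # real arXiv DOI
print(doi_resolves("10.9999/definitely.not.real"))  # fabricated -> False
```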
Also (personal opinion): I think this is a very easy objective for LLMs to be finetuned on, and, much like "look at the hands", it will serve as a detection tool for about five minutes
@tdietterich This is way too strict. Errors can slip in when using any tools. We aren't perfect
Having a prompt left in is a mistake; it's sloppy, but handing out a permanent penalty for one-time sloppiness is absurd