Mor Geva and Tuhin Chakrabarty argue LLM-generated peer reviews are too chaotic, filled with hallucinations, and increase author workloads

Mor Geva@megamor2

Reviews can seem very detailed, but in practice there's little information there. The summary is often full of fluff, so it's hard to understand what the paper is about. They often provide long lists of issues, and it's very hard to tell whether the concerns are major or minor. They often make unrealistic or nonsensical suggestions. All this without mentioning mistakes and hallucinations, which are often conveyed with such confidence that they can bias the whole evaluation. The final scores are often borderline, so there's no information there either.

As an author, it creates lots of work which is either stupid or not feasible. As an AC it's a nightmare: I want a concrete evaluation to work with, and I get this noisy, uninformative text. In rebuttals things become a joke: the person who generated the review has no idea how to judge what's going on, so typically you get a short statement like "I read the response and decided to keep my score".

REPLIES

Tuhin Chakrabarty@TuhinChakr

@megamor2 @delliott @ipeirotis

Desmond Elliott@delliott

@megamor2 Can you say more about the issues you are seeing in the LLM-generated reviews?

Tuhin Chakrabarty@TuhinChakr

I very much echo @megamor2 on the nitpicking and the long list of fluff (because it’s been trained to produce verbiage). Lol, it’s infuriating that LLMs present a minor, mostly ignorable flaw as something crucial, and the burden is then on authors and editors to review it.

Taha 🫡@taha_moji

@megamor2 What are the most insightful public discussion notes or conversations on this topic?
