/Tech22h ago

New Paper Shows Hallucination Detectors Often Ignore Reasoning

Original post

Jessy Li#936

GZ The Georgia Tech Basketball Fan@Gz_The_Divine

🚨New paper! Your hallucination detector says it evaluates reasoning. But what if it's just peeking at the final answer? We tested this: keep the reasoning, only change the answer. Many detectors' scores shift dramatically 🧵

2:12 PM · Jun 9, 2026 · 317 Views

GZ The Georgia Tech Basketball Fan@Gz_The_Divine

Paper: https://arxiv.org/abs/2605.08346

Work done w/ Minh Vu, @HongliZhan, @liraymond96, & Manish Bhattarai.

GZ The Georgia Tech Basketball Fan@Gz_The_Divine

TRACT stays stable under both FORCE and REMOVE since it scores the reasoning body, not the endpoint. It also stacks well: fusing TRACT with existing detectors raises average AUC by 5 to 20 points across all 5 models.


GZ The Georgia Tech Basketball Fan@Gz_The_Divine

We ran this across 4 benchmarks and 5 models. Some detectors swing 20+ AUC points just from changing or removing answer cues, even though the reasoning is untouched.
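To make the "swing" concrete, here is a minimal sketch of how an AUC change could be measured for one detector on one benchmark. Everything in it is an assumption for illustration: the function names, the convention that label 1 means "hallucinated", the choice to keep the original labels after the perturbation, and the toy scores, which are made up and are not results from the paper.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    Assumed convention: label 1 = hallucinated, and a higher score means
    the detector believes the trace is more likely hallucinated.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_swing(
    labels: Sequence[int],
    original_scores: Sequence[float],
    perturbed_scores: Sequence[float],
) -> float:
    """AUC drop, in points (0-100 scale), after an answer-only perturbation."""
    return 100.0 * (auroc(labels, original_scores) - auroc(labels, perturbed_scores))


# Toy, made-up scores for four traces (not data from the paper).
labels = [1, 1, 0, 0]
before = [0.9, 0.8, 0.2, 0.1]   # detector scores on the unmodified traces
after = [0.4, 0.6, 0.5, 0.3]    # scores after changing only the answer
print(auc_swing(labels, before, after))
```

A detector that really scores the reasoning body should show a swing near zero here, since the reasoning is identical in both runs.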


GZ The Georgia Tech Basketball Fan@Gz_The_Divine

Here's what we did:
FORCE: replace the final answer with the ground truth.
REMOVE: delete the answer step entirely.

Same reasoning body both times. A trace-faithful detector should remain informative under both.
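The two interventions can be sketched in a few lines. This is only an illustration under assumptions: the `Trace` type, the function names, and the "Final answer:" rendering are hypothetical and may differ from how the paper represents traces.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Trace:
    """Hypothetical container: a reasoning body plus an optional answer step."""
    reasoning: Tuple[str, ...]   # reasoning steps, never modified below
    answer: Optional[str]        # final answer, or None once removed


def force(trace: Trace, ground_truth: str) -> Trace:
    """FORCE: swap the final answer for the ground truth, keep the reasoning."""
    return replace(trace, answer=ground_truth)


def remove(trace: Trace) -> Trace:
    """REMOVE: drop the answer step entirely, keep the reasoning."""
    return replace(trace, answer=None)


def render(trace: Trace) -> str:
    """Flatten a trace into the text a detector would be given."""
    lines = list(trace.reasoning)
    if trace.answer is not None:
        lines.append(f"Final answer: {trace.answer}")
    return "\n".join(lines)


original = Trace(("Step 1: 2 + 2 = 4", "Step 2: 4 * 3 = 12"), answer="13")
forced = force(original, "12")
removed = remove(original)

# The reasoning body is identical in all three variants.
assert original.reasoning == forced.reasoning == removed.reasoning
print(render(forced))
print(render(removed))
```

Scoring `render(original)`, `render(forced)`, and `render(removed)` with the same detector then shows how much of its signal comes from the answer step alone.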


GZ The Georgia Tech Basketball Fan@Gz_The_Divine

So we asked: what does the reasoning itself look like when it's going wrong? It wanders, hedges, grows uneven, or diverges across samples. We built TRACT to pick up on these trajectory patterns as a lightweight text-only score.
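For a feel of what text-only trajectory cues might look like, here is a rough sketch of two of the patterns mentioned above, hedging and uneven step lengths. This is not TRACT: the hedge-word list, the function names, and the formulas are hypothetical stand-ins chosen for illustration, and the paper's actual features may be quite different.

```python
import statistics
from typing import Sequence

# Hypothetical hedge lexicon, chosen for illustration only.
HEDGES = {"maybe", "perhaps", "possibly", "might", "unsure", "probably"}


def hedge_rate(steps: Sequence[str]) -> float:
    """Fraction of words in the reasoning body that are hedge words."""
    words = [w.strip(".,;:!?").lower() for step in steps for w in step.split()]
    if not words:
        return 0.0
    return sum(w in HEDGES for w in words) / len(words)


def unevenness(steps: Sequence[str]) -> float:
    """Coefficient of variation of step lengths, measured in words."""
    lengths = [len(step.split()) for step in steps]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


steps = ["Maybe x is 4.", "So the answer is 4."]
print(hedge_rate(steps), unevenness(steps))
```

Both functions read only the reasoning steps, so their output cannot change under FORCE or REMOVE, which is the property the thread asks of a trace-faithful score.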
