
New Paper Shows Hallucination Detectors Often Ignore Reasoning

Original post by Jessy Li (#860)

🚨 New paper! Your hallucination detector says it evaluates reasoning. But what if it's just peeking at the final answer? We tested this: keep the reasoning, only change the answer. Many detectors' scores shift dramatically 🧵

2:12 PM · Jun 9, 2026 · 199 Views

Paper: https://arxiv.org/abs/2605.08346

Work done w/Minh Vu, @HongliZhan, @liraymond96, & Manish Bhattarai.


TRACT stays stable under both FORCE and REMOVE since it scores the reasoning body, not the endpoint. It also stacks well: fusing TRACT with existing detectors gives +5 to +20 average AUC across all 5 models.


We ran this across 4 benchmarks and 5 models. Some detectors swing 20+ AUC points just from changing or removing answer cues, even though the reasoning is untouched.
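To make "swing in AUC points" concrete, here is a minimal sketch of how such a swing could be measured. It assumes a detector returns one hallucination score per trace, that label 1 means "hallucinated", and that the labels of the original traces are reused after the perturbation. The function names and the toy numbers are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_swing(orig_scores: Sequence[float],
              perturbed_scores: Sequence[float],
              labels: Sequence[int]) -> float:
    """Change in AUC, in points (0-100 scale), after perturbing the answer."""
    return 100.0 * (auc(perturbed_scores, labels) - auc(orig_scores, labels))


# Toy, made-up scores: the detector separates classes on the original
# traces but loses some separation once the answer cue is changed.
labels = [1, 1, 0, 0]
original = [0.9, 0.8, 0.2, 0.1]
perturbed = [0.3, 0.2, 0.2, 0.1]
swing = auc_swing(original, perturbed, labels)
```

A detector that really scores the reasoning body should show a swing near zero here, since the body is identical in both conditions.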


Here's what we did. FORCE: replace the final answer with the ground truth. REMOVE: delete the answer step entirely.

Same reasoning body both times. A trace-faithful detector should remain informative under both.
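The two interventions can be sketched roughly as follows. This is a minimal illustration, assuming a trace is already split into a reasoning body and a separate final answer step; the `Trace` class, the function names, and the "Final answer:" rendering are hypothetical and not from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trace:
    reasoning: List[str]  # reasoning steps (the body), never modified
    answer: str           # final answer step; empty string means "no answer"


def force(trace: Trace, ground_truth: str) -> Trace:
    """FORCE: keep the reasoning body, replace the answer with the ground truth."""
    return Trace(reasoning=list(trace.reasoning), answer=ground_truth)


def remove(trace: Trace) -> Trace:
    """REMOVE: keep the reasoning body, drop the answer step entirely."""
    return Trace(reasoning=list(trace.reasoning), answer="")


def render(trace: Trace) -> str:
    """Flatten a trace into the text a detector would be asked to score."""
    parts = list(trace.reasoning)
    if trace.answer:
        parts.append(f"Final answer: {trace.answer}")
    return "\n".join(parts)


t = Trace(reasoning=["6 * 7 is 6 * 6 plus 6.", "So that is 36 + 5 = 41."], answer="41")
forced = force(t, "42")
removed = remove(t)
```

Scoring `render(t)`, `render(forced)`, and `render(removed)` with the same detector then shows how much of its signal comes from the answer step alone.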


So we asked: what does the reasoning itself look like when it's going wrong? It wanders, hedges, grows uneven, or diverges across samples. We built TRACT to pick up on these trajectory patterns as a lightweight text-only score.
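As a rough illustration of what text-only trajectory signals could look like, here is a toy sketch of three of the patterns named above: hedging, uneven step lengths, and divergence across samples. This is not TRACT. The hedge word list, the feature definitions, and all function names are assumptions made for this example, and "wandering" is not modeled at all.

```python
import re
from statistics import mean, pstdev
from typing import List

# Hypothetical hedge lexicon, chosen only for illustration.
HEDGES = {"maybe", "perhaps", "possibly", "might", "unsure", "probably"}


def tokens(text: str) -> List[str]:
    """Lowercased alphabetic tokens; digits and punctuation are ignored."""
    return re.findall(r"[a-z']+", text.lower())


def hedge_rate(steps: List[str]) -> float:
    """Fraction of tokens in the reasoning body that are hedge words."""
    toks = [t for s in steps for t in tokens(s)]
    return sum(t in HEDGES for t in toks) / len(toks) if toks else 0.0


def length_unevenness(steps: List[str]) -> float:
    """Coefficient of variation of step lengths (0 means perfectly even)."""
    lengths = [len(tokens(s)) for s in steps]
    if not lengths or mean(lengths) == 0:
        return 0.0
    return pstdev(lengths) / mean(lengths)


def sample_divergence(samples: List[List[str]]) -> float:
    """1 minus the mean pairwise Jaccard similarity of token sets across samples."""
    sets = [set(t for s in steps for t in tokens(s)) for steps in samples]
    pairs = [(a, b) for i, a in enumerate(sets) for b in sets[i + 1:]]
    if not pairs:
        return 0.0
    sims = [len(a & b) / len(a | b) if (a | b) else 1.0 for a, b in pairs]
    return 1.0 - mean(sims)
```

None of these features reads the final answer, which is the property that matters here: a score built only from the reasoning body cannot change under FORCE or REMOVE.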
