NYU's Tal Linzen and collaborators argue LLMs lack genuine introspection, explaining self-reporting as anomaly detection and confabulation.
A separate study found AI introspection is content-agnostic.
QUOTE POST
#1038 Raphaël Millière @RAPHAELMILLIERE
Great work! See also https://arxiv.org/abs/2603.05414 from @LedermanHarvey & @kmahowald
This is a nice cautionary tale about Morgan's canon in interpretability: "introspection" here is closer to anomaly detection with confabulation than to direct/privileged access to injected content.
1/ Can LLMs introspect, i.e., reason about their internal states? Recent work claims LLMs notice when their "thoughts" get tampered with, and can report their content. We looked closely and we think it's too early to say that. Work led by @shashwat_s19, with @tallinzen and me.
1:16 PM · May 28, 2026 · 2.5K Views
4:36 PM · May 28, 2026 · 999 Views