22h ago

Stanford NLP's Peter Hase says LLMs fail to detect internal state tampering in gaslight control experiments

Models could not distinguish inputs from injected hidden representations.

Sentiment: 50% positive, 50% negative (8 comments with sentiment). Positive users commend the study for raising the right evidentiary bar on LLM introspection claims, while negative users criticize the work for engaging with or debunking anthropomorphic narratives about AI.