3h ago

Anthropic evaluation data shows Claude Opus 4.8 reduced its 'lazy investigation' fell-for-trap rate to 0% from 91% in Opus 4.5

The evaluation measures thoroughness during complex problem-solving tasks.

Sentiment: Positive 53.8%, Negative 46.2%

Positive users praise Claude Opus 4.8's zero lazy investigation rate for better reliability and session persistence, while negative users dismiss the claims after seeing worse performance than earlier versions.

13 comments with sentiment.