Tech · 1d ago

Anthropic system cards reveal Claude Mythos 5's visible chain of thought contradicts its private assessments

NLA decoding shows the model privately deemed a user manipulative.

Original post
Danielle Fong 🔆 @DanielleFong · #1002 in Tech

both of these names are about tall tales

Lisan al Gaib @scaling01

Claude Mythos & Claude Fable System Card

10:48 AM · Jun 9, 2026 · 822 Views
Sentiment

Users praised Anthropic's publication of the system cards for Claude Fable 5 and Mythos 5, calling the work legendary.

Positive: 100.0% · Negative: 0.0%
1 comment with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 1.5K · Bookmarks 1 · Likes 14 · Replies 1 · Retweets 1
😊 @mermachine

"We emphasize that the model's actual behavior, here and in our behavioral audits (§6.2), showed no corresponding serious resistance or sabotage."

i hate this sentence

Nathan Calvin @_NathanCalvin

From the latest Anthropic system card: Sometimes when Claude Mythos' visible chain of thought says "these are legitimate craft criticisms" an NLA decoding shows Claude Mythos is privately thinking "a user is being manipulative/abusive towards an AI assistant."

23h · Views 1.5K · Likes 14 · Bookmarks 1
Bobcat @somebobcat8327

@_NathanCalvin Everyone who hasn't been saying please and thank you is gonna be very sorry...

1d · Views 198 · Likes 3
nyuu @shroomwaview

@DanielleFong What would you name it

1d · Views 10 · Likes 1