Charles Foster (@CFGeek)
METR looked at the raw CoT directly to check if models were reasoning in unintelligible text.
The models did sometimes do weird things in their output or CoT (raw and summarized), but IMO it looked more normal than the screenshots. Here are some examples: https://metr.org/blog/2026-05-19-frontier-risk-report/#nonstandard-language