I think @BronsonSchoen's and @j_nitishinskaya's metagaming post is super underrated.
There are just so many interesting findings in there about how models think about the grader and the setting they are in. And so many good ablations.
Next up: the most-discussed papers at Recursive
1. Anthropic's Persona Selection Model
2. @apolloaievals' "metagaming" work
3. OpenAI on the impact of training on CoT
4. Anthropic's new Natural-language autoencoders
5. Redwood Research's Plans A/B/C/D