In this episode, @OpenAI Chief Research Officer @markchen90 joins @allenpark to flambé shrimp, cook Korean stew, and chat about being at the frontier of AI research: why scaling laws and pre-training still matter, how OpenAI chooses research bets and allocates compute, what it means to develop research taste, why evals are in crisis, how to avoid benchmark-maxing, and what it will take for models to handle long-horizon real-world work, multimodal reasoning, and eventually end-to-end AI research.
Timestamps:
0:00 Intro
0:28 The Soup Story
1:52 From Trading to AI Research
3:21 How to Develop Research Taste
5:23 RL, Evals, and Superhuman Benchmarks
8:17 Cooking Begins on the Impulse Stove
8:53 Scaling Laws, Pre-Training, and Reasoning
12:33 OpenAI’s Research Roadmap and Compute Allocation
15:48 What Makes a Great Researcher
19:33 The Evals Crisis and Benchmark-Maxing
24:34 Jagged Intelligence, Context, and Long-Horizon Learning
27:14 Shrimp Flambé and New Research Bets
31:32 Multimodal Models and One Architecture
32:36 Vibe Researching and End-to-End AI Research
34:36 Failed Bets, Postmortems, and OpenAI’s Alpha
37:07 Final Taste Test
37:53 Overrated vs. Underrated AI Research
41:00 Closing




