OpenAI Research Chief Discusses Scaling Laws, Pretraining, and Evals Crisis

Original post

In this episode, @OpenAI Chief Research Officer @markchen90 joins @allenpark to flambé shrimp, cook Korean stew, and chat about being at the frontier of AI research: why scaling laws and pre-training still matter, how OpenAI chooses research bets and allocates compute, what it means to develop research taste, why evals are in crisis, how to avoid benchmark-maxing, and what it will take for models to handle long-horizon real-world work, multimodal reasoning, and eventually end-to-end AI research.

Timestamps:
0:00 Intro
0:28 The Soup Story
1:52 From Trading to AI Research
3:21 How to Develop Research Taste
5:23 RL, Evals, and Superhuman Benchmarks
8:17 Cooking Begins on the Impulse Stove
8:53 Scaling Laws, Pre-Training, and Reasoning
12:33 OpenAI’s Research Roadmap and Compute Allocation
15:48 What Makes a Great Researcher
19:33 The Evals Crisis and Benchmark-Maxing
24:34 Jagged Intelligence, Context, and Long-Horizon Learning
27:14 Shrimp Flambé and New Research Bets
31:32 Multimodal Models and One Architecture
32:36 Vibe Researching and End-to-End AI Research
34:36 Failed Bets, Postmortems, and OpenAI’s Alpha
37:07 Final Taste Test
37:53 Overrated vs. Underrated AI Research
41:00 Closing

2:41 PM · Jun 25, 2026 · 8.6K Views

Sentiment

Users praise the casual interview format with OpenAI's research chief for revealing genuine, unpolished insights, and call him a star.

Positive: 100.0%
Negative: 0.0%

2 comments with sentiment.

Via YOUTUBE.COM

Posts from X

Lindsay McCallum Rémy (@lindsmccallum)

taking "@openai is cooking" to a new level.

our chief research officer @markchen90 loves to cook so when @swyx and @allenpark started a new show, there was only one thing to do.

https://www.youtube.com/watch?v=fpAthTtha8c&t=5s

swyx 🔜 @aiDotEngineer (@swyx)

@adelwu_ @latentspacepod @OpenAI @markchen90 @allenpark ANSWER MY TEXTS ADEL

adel 🌟 (@adelwu_)

@latentspacepod @swyx @OpenAI @markchen90 @allenpark he’s a STAR @allenpark

PsudoMike 🇨🇦 (@PsudoMike)

@latentspacepod @OpenAI @markchen90 @allenpark The compute allocation choices labs make now determine what reaches the API 12 months out. For those building on the API layer, whether the bets are on reasoning, context windows, or speed matters a lot. Hard to plan around a roadmap that's genuinely uncertain.

andrea tateshestein pt 2 (@fsp1219187)

@latentspacepod @swyx @OpenAI @markchen90 @allenpark Smart format: research leaders are overtrained on polished answers, but under casual constraints you get the real model card—taste, tradeoffs, and what they dodge.

Divinmentis (@Divinmentis)

@latentspacepod @OpenAI @markchen90 @allenpark The compute-allocation angle is the strategic signal. Frontier research is not only model ideas; it is choosing which bets get scarce training/inference budget, eval attention, safety review, and deployment bandwidth. Governance becomes operations.
