Why AI Can Now Make Discoveries - my conversation with @danintheory, Lead of the Foundations of Reinforcement Learning team at @OpenAI
00:00 Intro: AI's wild week in mathematics
01:21 What OpenAI's Foundations of RL team does
03:08 Dan's journey: from black holes and quantum gravity to frontier AI
07:04 Are AI systems becoming useful for real science?
08:21 The AI math moment: Erdős, OpenAI, DeepMind, and Anthropic
08:52 Why the OpenAI result was an act of exploration
10:25 OpenAI vs. DeepMind: informal reasoning vs. formal proof
12:13 RL 101: learning by doing, not just watching
15:10 Why reinforcement learning works
15:58 How RL breaks: sparse feedback and long-horizon tasks
17:03 RLHF: how human feedback shaped early language models
18:48 Move 37, self-play, and the search for novel strategies
22:16 Explore vs. exploit in scientific discovery
24:49 Why RL may now be "the cake," not the cherry on top
25:46 Why RL started working with large language models
27:29 Is RL "sucking supervision through a straw"?
28:47 Why language may be the grounding layer for intelligence
31:46 A contrarian take on the Bitter Lesson
32:41 What test-time compute actually is
34:50 How RL gives models the ability to think
35:40 Verifiable rewards, math, coding, and the messy real world
38:00 What physics can teach us about AI
42:08 Is there a thermodynamics of AI?
43:08 From Erdős problems to Einstein-level AI
45:16 Is AI already doing original science?
45:51 How far are we from AI automating AI research?
47:41 Why Dan is excited about the future of science