
Lindsay M. Smith and Matt Wiemann launch DiscoverPhysics to evaluate LLM agents on scientific experimentation and physical law discovery

The pipeline tests if LLMs can design scientific experiments.

Original post

Can LLMs discover new laws of physics? We present DiscoverPhysics, a pipeline to benchmark LLM agents on experimentation, analysis & discovery. Co-led by @LindsayMSmith3 w/ Peter Melchior, @kdqg1 @andrewgwils @Pavel_Izmailov, Carolina Cuesta-Lázaro https://arxiv.org/abs/2605.26087 1/10

7:28 AM · May 28, 2026
Reposted by

So excited about this project. Despite all the talk about AGI, AI has barely scratched the surface of discovering scientific theories or even giving us new scientific insights. DiscoverPhysics is a benchmark for the future.

Matt Wiemann @Space_Boy_Matt

Can LLMs discover new laws of physics? We present DiscoverPhysics, a pipeline to benchmark LLM agents on experimentation, analysis and discovery. https://arxiv.org/abs/2605.26087 Co-led by @LindsayMSmith3 w/ Peter Melchior, @kdqg1 @andrewgwils @Pavel_Izmailov Carol Cuesta-Lazaro 1/10

3:17 PM · May 28, 2026 · 9.2K Views
3:43 PM · May 28, 2026 · 7.2K Views

Very excited to release DiscoverPhysics, a new benchmark and evaluation pipeline for experimentation and discovery in LLMs.

🌐 https://sampsonml.github.io/DiscoverPhysicsLeaderboard/ 📰 http://arxiv.org/abs/2605.26087

3:47 PM · May 28, 2026 · 1.3K Views

We design interactive worlds where the model can run experiments with the goal of figuring out how the world works. I believe this methodology can scale to extremely complex worlds, making this task potentially superhuman.

Pavel Izmailov @Pavel_Izmailov

Very excited to release DiscoverPhysics, a new benchmark and evaluation pipeline for experimentation and discovery in LLMs. 🌐 https://sampsonml.github.io/DiscoverPhysicsLeaderboard/ 📰 http://arxiv.org/abs/2605.26087

3:47 PM · May 28, 2026 · 1.3K Views
3:47 PM · May 28, 2026 · 466 Views

with awesome collaborators @Space_Boy_Matt @LindsayMSmith3 Peter Melchior @kdqg1 @andrewgwils and Carol Cuesta-Lazaro.

See also Matt's detailed thread with lots of interesting results:

Matt Wiemann @Space_Boy_Matt

Can LLMs discover new laws of physics? We present DiscoverPhysics, a pipeline to benchmark LLM agents on experimentation, analysis and discovery. https://arxiv.org/abs/2605.26087 Co-led by @LindsayMSmith3 w/ Peter Melchior, @kdqg1 @andrewgwils @Pavel_Izmailov Carol Cuesta-Lazaro 1/10

3:17 PM · May 28, 2026 · 9.2K Views
3:47 PM · May 28, 2026 · 316 Views

with awesome collaborators @Space_Boy_Matt @LindsayMSmith3 Peter Melchior @kdqg1 @andrewgwils @Pavel_Izmailov and Carol Cuesta-Lazaro.

See also Matt's detailed thread with lots of interesting results:

3:45 PM · May 28, 2026 · 166 Views

Very cool. I wonder how sensitive the results are to the data format. Humans are really good at getting an intuition for the laws of physics from vision, rather than exact numbers, and only afterwards try to verify that intuition with experiments.

3:58 PM · May 28, 2026 · 632 Views