3h ago

Lindsay M. Smith and Matt Wiemann launch DiscoverPhysics to evaluate LLM agents on scientific experimentation and physical law discovery

The pipeline tests if LLMs can design scientific experiments.

Original post

#376@PAVEL_IZMAILOV OP

Matt Wiemann@SPACE_BOY_MATT

Can LLMs discover new laws of physics? We present DiscoverPhysics, a pipeline to benchmark LLM agents on experimentation, analysis & discovery. Co-led by @LindsayMSmith3 w/ Peter Melchior, @kdqg1 @andrewgwils @Pavel_Izmailov, Carolina Cuesta-Lázaro https://arxiv.org/abs/2605.26087 1/10

7:28 AM · May 28, 2026

Reposted by

#376@PAVEL_IZMAILOV

QUOTE POST

#154Andrew Gordon Wilson@ANDREWGWILS

So excited about this project. Despite all the talk about AGI, AI has barely scratched the surface of discovering scientific theories or even giving us new scientific insights. DiscoverPhysics is a benchmark for the future.

Matt Wiemann@Space_Boy_Matt

Can LLMs discover new laws of physics? We present DiscoverPhysics, a pipeline to benchmark LLM agents on experimentation, analysis and discovery. https://arxiv.org/abs/2605.26087 Co-led by @LindsayMSmith3 w/ Peter Melchior, @kdqg1 @andrewgwils @Pavel_Izmailov, Carol Cuesta-Lazaro 1/10

3:17 PM · May 28, 2026 · 9.2K Views

3:43 PM · May 28, 2026 · 7.2K Views

POST

#376Pavel Izmailov@PAVEL_IZMAILOV

Very excited to release DiscoverPhysics, a new benchmark and evaluation pipeline for experimentation and discovery in LLMs.

🌐 https://sampsonml.github.io/DiscoverPhysicsLeaderboard/ 📰 http://arxiv.org/abs/2605.26087

3:47 PM · May 28, 2026 · 1.3K Views

#376Pavel Izmailov@PAVEL_IZMAILOV

We design interactive worlds where the model can run experiments with the goal of figuring out how the world works. I believe this methodology can scale to extremely complex worlds, making this task potentially superhuman.

3:47 PM · May 28, 2026 · 466 Views

QUOTE POST

#376Pavel Izmailov@PAVEL_IZMAILOV

with awesome collaborators @Space_Boy_Matt @LindsayMSmith3 Peter Melchior @kdqg1 @andrewgwils and Carol Cuesta-Lazaro.

See also Matt's detailed thread with lots of interesting results:

Matt Wiemann@Space_Boy_Matt

3:17 PM · May 28, 2026 · 9.2K Views

3:47 PM · May 28, 2026 · 316 Views

QUOTE POST

#376Pavel Izmailov@PAVEL_IZMAILOV

with awesome collaborators @Space_Boy_Matt @LindsayMSmith3 Peter Melchior @kdqg1 @andrewgwils @Pavel_Izmailov and Carol Cuesta-Lazaro.

See also Matt's detailed thread with lots of interesting results:

3:45 PM · May 28, 2026 · 166 Views

QUOTE POST

#612Ravid Shwartz Ziv@ZIV_RAVID

Very cool. I wonder how sensitive the results are to the data format? Humans are really good at getting an intuition for the laws of physics from vision, rather than the exact numbers, and only afterwards try to verify it with experiments.

Matt Wiemann@Space_Boy_Matt

3:17 PM · May 28, 2026 · 9.2K Views

3:58 PM · May 28, 2026 · 632 Views