Haotian Ye introduced SimpleTES, a framework that scales scientific discovery with large language models through evaluation-driven propose-evaluate-refine loops, rather than through extra tokens or agents.
It delivered new state-of-the-art solutions on 21 open problems.
Scaling evaluations—not just compute—is critical for AI-driven science.
SimpleTES introduces a new framework to scale discovery loops, finding new SOTA solutions across 21 open science problems.
Including:
• >2× faster LASSO algorithm
• more efficient quantum routing
+ more!
Great work led by @haotian_yeee and wonderful collaborators!
🚀 Today, we’re excited to introduce SimpleTES for scaling the scientific discovery loop. 🧵
I always ask myself: what are we actually scaling in scientific discovery? Most LLM discovery methods focus on scaling test-time generation: more tokens, more agents, more turns. But science advances through evaluation-driven loops: propose → evaluate → refine → repeat. SimpleTES captures this idea, discovering SOTA solutions across 21 scientific problems!
Key discoveries:
🏎️ A LASSO solver 2.17x faster than glmnet, the gold-standard LASSO solver engineered for decades.
⚛️ 24.5% less quantum routing overhead on IBM Q20, superior to the previous standard library LightSABRE.
📐 0.380868 on Erdős Minimum Overlap, outperforming previous solutions from mixed-frontier ensembles or humans.
🧬 0.74 on Tabula Muris (scRNA-seq denoising): new SOTA, generalizing to unseen tissue types without retraining.
#LLM #AI4Science #ScalingLaws #SimpleTES #MachineLearning
Congrats @haotian_yeee and team on SimpleTES and on discovering SOTA solutions across 21 scientific problems! 🚀🔬
What I find especially exciting is the shift from scaling generation to scaling the full scientific discovery loop: propose → evaluate → refine → repeat. 🔁 By making evaluation signals the core driver of test-time search,
SimpleTES points to a compelling path toward more systematic, evaluation-driven AI for science. 🚀
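For readers who want a concrete picture of propose → evaluate → refine → repeat, here is a toy sketch of such a loop. It is not the SimpleTES implementation: the names `evaluate`, `propose`, and `discovery_loop` are hypothetical, the objective is a made-up one-dimensional function, and a random perturbation stands in for the LLM proposer. The only point it illustrates is that the budget being scaled is the number of evaluations, and that evaluation scores decide which candidates get refined next.

```python
import random


def evaluate(x):
    # Toy evaluator (assumption): higher is better, best possible score is 0 at x = 3.
    return -(x - 3.0) ** 2


def propose(parent, rng, step=0.5):
    # Stand-in for an LLM proposal (assumption): perturb an existing candidate.
    return parent + rng.gauss(0.0, step)


def discovery_loop(n_evals, seed=0, pool_size=4):
    """Run n_evals rounds of propose -> evaluate -> refine; return (score, candidate)."""
    rng = random.Random(seed)
    # The pool holds (score, candidate) pairs, seeded with one initial candidate.
    pool = [(evaluate(0.0), 0.0)]
    for _ in range(n_evals):
        _, parent = rng.choice(pool)           # pick a scored candidate to refine
        child = propose(parent, rng)           # propose
        pool.append((evaluate(child), child))  # evaluate
        pool.sort(reverse=True)                # refine: keep only the top scorers
        del pool[pool_size:]
    return pool[0]


if __name__ == "__main__":
    for budget in (0, 20, 200):
        score, candidate = discovery_loop(budget)
        print(f"evals={budget:4d}  best score={score:.4f}  candidate={candidate:.4f}")
```

Because the pool always retains its best entry, the best score never decreases as the evaluation budget grows, which is the property an evaluation-driven loop relies on.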