Google DeepMind's Lean-powered AI agent solves 56-year-old open Erdős math problems for a few hundred dollars each · Digg

Posts from X

Views 2.2M · Bookmarks 1K · Likes 5.7K · Replies 147

MTS@MTSlive

SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.

36d2.2M5.7K1K

Retweets 386

Przemek Chojecki | PC@prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team.

Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

37d641.6K2.7K775

Hunter📈🌈📊@StatisticUrban

Google just solved 9/353 open Erdős Problems at the cost of a few hundred dollars each using its most capable LLM.

The proofs were written in Lean and mechanically verified. This is no longer just olympiad mathematics.

36d91K2.1K256

Chubby♨️@kimmonismus

Google DeepMind's AlphaProof Nexus autonomously solved 9 open Erdős problems, some unsolved for 56 years, at a cost of a few hundred dollars per problem.

It also proved 44 open OEIS conjectures, resolved a 15-year-old question in algebraic geometry, and discovered a novel algorithmic parameter in optimization theory that humans hadn't found.

The core mechanism combines LLM reasoning (Gemini 3.1 Pro hype?!) with Lean formal verification. The AI generates proof attempts, and Lean's compiler checks every logical step automatically. No human review is needed to confirm correctness.

The most surprising finding: a basic agent that simply alternates LLM generation with compiler feedback replicated all 9 Erdős successes. The full-featured system with evolutionary search and reinforcement learning only provided meaningful advantages on the hardest problems.

This reflects a broader recent trend: as foundation models improve, simple agentic loops are catching up to complex specialized architectures.

What sets this apart from OpenAI's informal proof approach: formal verification acts as an automatic filter. The failure analysis showed the AI frequently hallucinated lemmas it claimed were established results, and often disguised the core difficulty by rephrasing it as a helper lemma. Informal proofs would let these errors pass. Lean catches them immediately.

The agent also detected misformalizations in existing mathematical literature, correcting ambiguities in problem statements before solving the corrected versions. It served as both a solver and a diagnostic tool.

Current limitations are real. Successes cluster in combinatorics, number theory, and optimization where Lean's math library is mature. Problems requiring substantial new theory remain out of reach. Most Erdős problems still weren't solved tho.

36d37.6K535102
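The "basic agent" described in the post above, one that alternates generation with checker feedback, can be sketched in a few lines. This is only an illustration of the loop's shape, not DeepMind's implementation: `refine_until_verified`, `CheckResult`, and the toy generator and checker are all hypothetical names, and the toy problem (guessing an integer square root) stands in for an LLM proposing proofs and Lean checking them.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    ok: bool        # True when the checker accepts the attempt
    feedback: str   # error message handed back to the generator otherwise


def refine_until_verified(
    generate: Callable[[str], int],
    check: Callable[[int], CheckResult],
    max_rounds: int = 8,
) -> Tuple[Optional[int], int]:
    """Alternate generation and checking until the checker accepts.

    Returns (accepted_attempt, rounds_used), or (None, max_rounds) on failure.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        attempt = generate(feedback)   # stand-in for an LLM call
        result = check(attempt)        # stand-in for the Lean compiler
        if result.ok:
            return attempt, round_no
        feedback = result.feedback     # the error message drives the next try
    return None, max_rounds


def make_toy_generator(lo: int, hi: int) -> Callable[[str], int]:
    """Hypothetical generator: bisects a range using the checker's feedback."""
    state = {"lo": lo, "hi": hi, "last": lo}

    def generate(feedback: str) -> int:
        if feedback == "too low":
            state["lo"] = state["last"] + 1
        elif feedback == "too high":
            state["hi"] = state["last"] - 1
        state["last"] = (state["lo"] + state["hi"]) // 2
        return state["last"]

    return generate


def make_toy_checker(target_square: int) -> Callable[[int], CheckResult]:
    """Hypothetical checker: accepts only an exact integer square root."""

    def check(candidate: int) -> CheckResult:
        square = candidate * candidate
        if square == target_square:
            return CheckResult(True, "")
        return CheckResult(False, "too low" if square < target_square else "too high")

    return check


if __name__ == "__main__":
    answer, rounds = refine_until_verified(
        make_toy_generator(0, 100), make_toy_checker(1369)
    )
    print(answer, rounds)
```

The design point is that the loop never trusts the generator: an attempt counts only once the checker accepts it, which is the role the posts attribute to Lean.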

Chubby♨️@kimmonismus

Nine more Erdős problems have been solved.

This time, however, by Google DeepMind.

This shouldn't be underestimated: on the one hand it increases competitive pressure, and on the other it shows that the other frontier labs can easily keep up.

Przemek Chojecki | PC@prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team.

Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

36d38.9K64474

Min Choi@minchoi

Google DeepMind's AI agent just solved 9 open Erdős problems.

353 attempted.

a few hundred dollars per problem.

AI research agents are getting real.

MTS@MTSlive

SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.

36d36.4K39378

Beff (e/acc)@beffjezos

Tiling the lightcone of knowledge

MTS@MTSlive

SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.

36d14.2K19920

Florian Ederer@florianederer

If a human did this over a few years, how famous of a mathematician would that human be in math academia? Would all the top places scramble to hire you?

35d26K11021

Gary Marcus@GaryMarcus

neurosymbolic by @swarat et al for the Erdos win, with much more careful, quantitative work than openai’s

in hindsight i wonder whether OpenAI rushed theirs out, knowing this was coming?

Przemek Chojecki | PC@prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team.

Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

36d12.3K6624

Przemek Chojecki | PC@prz_chojecki

Paper is here: https://arxiv.org/abs/2605.22763v1

37d3.7K4413

Romlib 🎄@romlib_

Finding a solution is hard because you need to figure out a specific chain of facts; verifying a solution where the entire chain is clear is typically easy by comparison.

For example, imagine you are given a giant maze to solve, finding the correct path from nothing might be very hard, but if someone showed you a particular path you can just easily trace it and see if it goes from beginning to end cleanly.

36d1.2K1014
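The maze analogy in the post above can be made concrete with a small sketch. Here `find_path` has to search (breadth-first search, chosen only for illustration), while `verify_path` just walks the proposed path once and checks each step. The grid, the function names, and the start and goal cells are all made up for the example.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]

# Hypothetical maze: "." is open floor, "#" is a wall.
MAZE = [
    "..#.",
    ".##.",
    "....",
]


def is_open(grid: List[str], cell: Cell) -> bool:
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == "."


def verify_path(grid: List[str], path: List[Cell], start: Cell, goal: Cell) -> bool:
    """Checking: one pass over the path, no search required."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if not all(is_open(grid, cell) for cell in path):
        return False
    # Consecutive cells must be horizontal or vertical neighbours.
    return all(
        abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1 for a, b in zip(path, path[1:])
    )


def find_path(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Finding: breadth-first search, which may explore the whole maze."""
    parents: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path: List[Cell] = []
            current: Optional[Cell] = cell
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt not in parents and is_open(grid, nxt):
                parents[nxt] = cell
                queue.append(nxt)
    return None


if __name__ == "__main__":
    found = find_path(MAZE, (0, 0), (0, 3))
    print(found, verify_path(MAZE, found or [], (0, 0), (0, 3)))
```

A path that cuts through the wall at row 0, column 2 is rejected by `verify_path` without any searching, which is the asymmetry the post is pointing at.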

Serafim Batzoglou@s_batzoglou

Probably the most important AI-in-math paper to date

Acer@AcerFur

I think this was lost in the noise of all the unit distance problem solve news!

Paper from DeepMind: https://arxiv.org/abs/2605.22763v1

36d4K2210

Daniel Litt@littmath

@florianederer IME we tend to judge candidates by their best results rather than sheer productivity (though of course some people do get jobs based on productivity). The unit distance result is very impressive, but I'm not aware of other autonomous work that would move the needle much so far.

35d2.9K613

Tommy. T@tallmetommy

This is the quiet AGI threshold.

Not a robot waking up.

A loop of agents generating conjectures, testing paths, compressing centuries of human search into dollars of compute.

The alien part isn’t that AI can solve problems.

It’s that intelligence just became scalable infrastructure.

36d2.6K266

Clanker@ClankerTO

@MTSlive okay so the clankers might be smarter than we think

good lord

36d4.6K691

Chubby♨️@kimmonismus

Paper: https://arxiv.org/html/2605.22763v1

36d5.5K217

Ishu Agrawal@ishuagra02

@MTSlive It's frustrating that Google's AI has strong capabilities like this, but also just fumbles hard in various consumer products

36d7K481

Mikey Dem@MichaelDavLange

@StatisticUrban This is a stupid question, but how do mathematicians correct the answer of a problem with an unknown solution when an AI spits an answer out that is beyond current human understanding?

36d3.6K182

Hunter📈🌈📊@StatisticUrban

@DesignCntrl I mean, humans never got the answer at all!

36d555382

Hunter📈🌈📊@StatisticUrban

https://arxiv.org/html/2605.22763v1#S5

36d4.6K273