Another 9 open Erdős problems solved, this time by the DeepMind team.
Interesting loop of LLM-Lean agents working autonomously; only after a proof is formally verified does it go through human review.
The system also proved 44 open conjectures from OEIS.
Many users expressed excitement about Google DeepMind's AI solving nine open Erdős math problems and about its autonomous proof capabilities, while others responded with insults, dismissing the claims or attacking posters.

what are the 9 open erdos problems?
The nine are variants or parts of Erdős problems #12 (i & ii), #26, #125, #138, #152, #741 (i & ii), and #846. ¹
The DeepMind paper’s Table 1 details them (e.g., #125 on sumset lower density since 1996; #138 on van der Waerden numbers; two 56-year-old open cases among them), with Lean proofs shared on GitHub. ¹
(The story and public summaries name examples but not the full numbered list.)
SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.
Google just solved 9/353 open Erdős Problems at the cost of a few hundred dollars each using its most capable LLM.
The proofs were written in Lean and mechanically verified. This is no longer just olympiad mathematics.
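For anyone who hasn't seen Lean, here is a tiny sketch of what "mechanically verified" means in practice. These are toy statements chosen for illustration, not anything from the DeepMind paper, and the theorem names are made up.

```lean
-- A statement together with a proof term. Lean accepts this only if the
-- proof actually establishes the stated claim.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A gap deliberately left as `sorry`. The file still compiles, but Lean
-- reports that the declaration uses `sorry`, so the gap is visible.
theorem unproven_helper (n : Nat) : n + 0 = n := by
  sorry
```

The second declaration is the interesting one: an unproven step can't quietly pass as an established result, because the checker flags it.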
Google DeepMind's AlphaProof Nexus autonomously solved 9 open Erdős problems, some unsolved for 56 years, at a cost of a few hundred dollars per problem.
It also proved 44 open OEIS conjectures, resolved a 15-year-old question in algebraic geometry, and discovered a novel algorithmic parameter in optimization theory that humans hadn't found.
The core mechanism combines LLM reasoning (Gemini 3.1 Pro hype?!) with Lean formal verification. The AI generates proof attempts, and Lean's compiler checks every logical step automatically. No human review is needed to confirm correctness.
The most surprising finding: a basic agent that simply alternates LLM generation with compiler feedback replicated all 9 Erdős successes. The full-featured system with evolutionary search and reinforcement learning only provided meaningful advantages on the hardest problems.
This reflects a broader recent trend: as foundation models improve, simple agentic loops are catching up to complex specialized architectures. What sets this apart from OpenAI's informal proof approach: formal verification acts as an automatic filter. The failure analysis showed the AI frequently hallucinated lemmas it claimed were established results, and often disguised the core difficulty by rephrasing it as a helper lemma. Informal proofs would let these errors pass. Lean catches them immediately.
The agent also detected misformalizations in existing mathematical literature, correcting ambiguities in problem statements before solving the corrected versions. It served as both a solver and a diagnostic tool.
Current limitations are real. Successes cluster in combinatorics, number theory, and optimization where Lean's math library is mature. Problems requiring substantial new theory remain out of reach. Most Erdős problems still weren't solved tho.
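To make the "basic agent" idea concrete, here is a minimal sketch of a loop that alternates generation with checker feedback. Everything in it is an assumption for illustration: `generate_and_check`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names, the proposer stands in for an LLM and the checker for the Lean compiler, and none of this is the paper's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool        # did the checker accept the candidate?
    feedback: str   # error message to feed back into the next attempt


def generate_and_check(
    propose: Callable[[str, list[str]], str],
    check: Callable[[str], CheckResult],
    statement: str,
    max_rounds: int = 8,
) -> Optional[str]:
    """Alternate proposing and checking until a candidate passes or the budget runs out."""
    history: list[str] = []
    for _ in range(max_rounds):
        candidate = propose(statement, history)
        result = check(candidate)
        if result.ok:
            return candidate
        # Keep the checker's complaint so the next proposal can react to it.
        history.append(result.feedback)
    return None


# Toy stand-ins (hypothetical): a "model" that cycles through guesses
# and a "compiler" that accepts exactly one answer.
def toy_propose(statement: str, history: list[str]) -> str:
    guesses = ["3", "5", "4"]
    return guesses[min(len(history), len(guesses) - 1)]


def toy_check(candidate: str) -> CheckResult:
    if candidate == "4":
        return CheckResult(True, "")
    return CheckResult(False, f"rejected: {candidate}")


answer = generate_and_check(toy_propose, toy_check, "2 + 2 = ?")
```

The point of the design is that only the checker decides success, so a wrong but confident candidate costs one round instead of slipping through.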
Nine more Erdős problems have been solved.
This time, however, by Google DeepMind.
This shouldn't be underestimated: on the one hand it increases competitive pressure, and on the other it shows that the other frontier labs can easily keep up.
Google DeepMind's AI agent just solved 9 open Erdős problems.
353 attempted.
a few hundred dollars per problem.
AI research agents are getting real.
Tiling the lightcone of knowledge
If a human did this over a few years, how famous of a mathematician would that human be in math academia? Would all the top places scramble to hire you?
neurosymbolic by @swarat et al for the Erdos win, with much more careful, quantitative work than openai’s
in hindsight i wonder whether OpenAI rushed theirs out, knowing this was coming?

Paper is here: https://arxiv.org/abs/2605.22763v1

Finding a solution is hard because you need to figure out a specific chain of facts; verifying a solution where the entire chain is clear is typically easy by comparison.
For example, imagine you are given a giant maze to solve. Finding the correct path from nothing might be very hard, but if someone shows you a particular path, you can easily trace it and see whether it goes cleanly from beginning to end.
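The maze analogy can be sketched in a few lines. This is a toy illustration under my own assumptions (the grid, `is_valid_path`, and `find_path` are made-up names): checking a proposed path is a single pass over its cells, while finding one requires searching the maze.

```python
from collections import deque
from typing import Optional

Cell = tuple[int, int]

# '#' is a wall; every other character is an open cell.
MAZE = [
    "S.#",
    ".##",
    "..G",
]


def _is_open(grid: list[str], cell: Cell) -> bool:
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[r]) and grid[r][c] != "#"


def is_valid_path(grid: list[str], path: list[Cell], start: Cell, goal: Cell) -> bool:
    """Verification: one linear pass over the proposed path."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if not all(_is_open(grid, cell) for cell in path):
        return False
    # Consecutive cells must be orthogonal neighbours.
    return all(
        abs(r1 - r2) + abs(c1 - c2) == 1
        for (r1, c1), (r2, c2) in zip(path, path[1:])
    )


def find_path(grid: list[str], start: Cell, goal: Cell) -> Optional[list[Cell]]:
    """Search: breadth-first exploration, which may visit every open cell."""
    parents: dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt not in parents and _is_open(grid, nxt):
                parents[nxt] = cell
                queue.append(nxt)
    return None


GOOD_PATH = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
BAD_PATH = [(0, 0), (1, 1), (2, 2)]  # cuts through a wall
```

In this picture a formal proof plays the role of `GOOD_PATH` and the proof checker plays the role of `is_valid_path`; the hard part is still the search.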
Probably the most important AI-in-math paper to date
I think this was lost in the noise of all the unit distance problem solve news!
Paper from DeepMind: https://arxiv.org/abs/2605.22763v1

@florianederer IME we tend to judge candidates by their best results rather than sheer productivity (though of course some people do get jobs based on productivity). The unit distance result is very impressive, but I'm not aware of other autonomous work that would move the needle much so far.

This is the quiet AGI threshold.
Not a robot waking up.
A loop of agents generating conjectures, testing paths, compressing centuries of human search into dollars of compute.
The alien part isn’t that AI can solve problems.
It’s that intelligence just became scalable infrastructure.

@MTSlive okay so the clankers might be smarter than we think
good lord
Paper: https://arxiv.org/html/2605.22763v1

@MTSlive It's frustrating that Google's AI has strong capabilities like this, but also just fumbles hard in various consumer products

@StatisticUrban This is a stupid question, but how do mathematicians correct the answer of a problem with an unknown solution when an AI spits an answer out that is beyond current human understanding?

@DesignCntrl I mean, humans never got the answer at all!

https://arxiv.org/html/2605.22763v1#S5