
Google DeepMind's LLM-Lean agent loop resolves 9 open Erdős math problems and proves 44 OEIS conjectures

Each proof cost a few hundred dollars to generate

Original post

Another 9 open Erdos problems solved, this time by DeepMind team. Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

11:29 PM · May 23, 2026
Reposted by

neurosymbolic by @swarat et al for the Erdos win, with much more careful, quantitative work than openai’s

in hindsight i wonder whether OpenAI rushed theirs out, knowing this was coming?

Przemek Chojecki | PC @prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team. Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

6:29 AM · May 24, 2026 · 576.7K Views
7:45 PM · May 24, 2026 · 12.1K Views

neurosymbolic by @swarat et al for the win.

unless this much more careful, quantitative work than openai’s

Przemek Chojecki | PC @prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team. Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

6:29 AM · May 24, 2026 · 576.7K Views
7:35 PM · May 24, 2026 · 1.9K Views

Tiling the lightcone of knowledge

MTS @MTSlive

SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.

5:04 PM · May 24, 2026 · 2.1M Views
6:00 PM · May 24, 2026 · 14.1K Views

So what is the average social value (in $) per Erdos problem solved?

MTS @MTSlive

SITUATION DETECTED: Google DeepMind’s AI agent autonomously solved 9 of 353 open Erdos problems in mathematics, at a cost of a few hundred dollars per problem.

5:04 PM · May 24, 2026 · 2.1M Views
5:59 PM · May 24, 2026 · 41.3K Views

@So8res Sci-fi has long been fantasies of sci being the center of power and status. It is in fact neither.

Nate Soares ⏹️ @So8res

In sci-fi books and movies, AI solving a bunch of math problems that stood open for decades would've been a big deal. Why isn't the mainstream media turning this into a bunch of sensational stories?

8:27 AM · May 25, 2026 · 10.1K Views
12:06 PM · May 25, 2026 · 1.2K Views

@robinhanson Tending to zero

Robin Hanson @robinhanson

So what is the average social value (in $) per Erdos problem solved?

5:59 PM · May 24, 2026 · 41.3K Views
6:54 PM · May 24, 2026 · 257 Views

Knowing quite a few of the Erdős epsilons personally, I think Erdős would have been thrilled.

11:39 PM · May 24, 2026 · 2.2K Views

The paper in general is worth reading; it focuses on areas where the Lean ecosystem is more mature.

The Erdős problems are these:

Shubhendu Trivedi @_onionesque

"Our most capable agent autonomously resolved 9 of 353 open Erdős problems at the per-problem cost of a few hundred dollars, proved 44/492 OEIS conjectures, and is being deployed in combinatorics, optimization, graph theory, algebraic geometry, and quantum optics research."

8:07 AM · May 24, 2026 · 9K Views
4:15 PM · May 24, 2026 · 306 Views

Amazing progress

Hunter📈🌈📊 @StatisticUrban

Google just solved 9/353 open Erdős Problems at the cost of a few hundred dollars each using its most capable LLM. The proofs were written in Lean and mechanically verified. This is no longer just olympiad mathematics.

5:45 PM · May 24, 2026 · 85.4K Views
6:23 PM · May 24, 2026 · 743 Views

Nine more Erdős problems have been solved.

This time, however, by Google DeepMind.

This shouldn't be underestimated, because on the one hand it increases competitive pressure, and on the other hand it proves that the other Frontier Labs can easily keep up.

Przemek Chojecki | PC @prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team. Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

6:29 AM · May 24, 2026 · 576.7K Views
6:36 PM · May 24, 2026 · 38.3K Views

Google DeepMind's AlphaProof Nexus autonomously solved 9 open Erdős problems, some unsolved for 56 years, at a cost of a few hundred dollars per problem.

It also proved 44 open OEIS conjectures, resolved a 15-year-old question in algebraic geometry, and discovered a novel algorithmic parameter in optimization theory that humans hadn't found.

The core mechanism combines LLM reasoning (Gemini 3.1 Pro hype?!) with Lean formal verification. The AI generates proof attempts, and Lean's compiler checks every logical step automatically. No human review is needed to confirm correctness.

The most surprising finding: a basic agent that simply alternates LLM generation with compiler feedback replicated all 9 Erdős successes. The full-featured system with evolutionary search and reinforcement learning only provided meaningful advantages on the hardest problems.

This reflects a broader recent trend: as foundation models improve, simple agentic loops are catching up to complex specialized architectures. What sets this apart from OpenAI's informal proof approach: formal verification acts as an automatic filter. The failure analysis showed the AI frequently hallucinated lemmas it claimed were established results, and often disguised the core difficulty by rephrasing it as a helper lemma. Informal proofs would let these errors pass. Lean catches them immediately.

The agent also detected misformalizations in existing mathematical literature, correcting ambiguities in problem statements before solving the corrected versions. It served as both a solver and a diagnostic tool.

Current limitations are real. Successes cluster in combinatorics, number theory, and optimization where Lean's math library is mature. Problems requiring substantial new theory remain out of reach. Most Erdős problems still weren't solved tho.

10:17 PM · May 24, 2026 · 34.5K Views

Paper: https://arxiv.org/html/2605.22763v1

Chubby♨️ @kimmonismus


10:17 PM · May 24, 2026 · 34.5K Views
10:17 PM · May 24, 2026 · 5.1K Views
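For readers wondering what "a basic agent that simply alternates LLM generation with compiler feedback" might look like in practice, here is a minimal Python sketch. It is not the paper's implementation: `prove_loop`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names made up for illustration, and the toy functions stand in for a real LLM call and a real Lean invocation, neither of which is shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool        # True when the checker accepts the candidate proof
    feedback: str   # compiler-style error text, empty on success


def prove_loop(
    statement: str,
    propose: Callable[[str, str], str],
    check: Callable[[str, str], CheckResult],
    max_rounds: int = 8,
) -> Optional[str]:
    """Alternate generation and checking until a candidate is accepted.

    `propose` and `check` are assumptions of this sketch: the first stands
    in for an LLM call, the second for a formal proof checker.
    """
    feedback = ""
    for _ in range(max_rounds):
        candidate = propose(statement, feedback)
        result = check(statement, candidate)
        if result.ok:
            return candidate          # only checked proofs leave the loop
        feedback = result.feedback    # feed the error into the next attempt
    return None                       # budget exhausted, nothing verified


def toy_propose(statement: str, feedback: str) -> str:
    # Toy stand-in for a model: it "fixes" the proof once it sees an error.
    return "by simp" if "sorry" in feedback else "by sorry"


def toy_check(statement: str, candidate: str) -> CheckResult:
    # Toy stand-in for a proof checker: reject anything containing `sorry`.
    if "sorry" in candidate:
        return CheckResult(False, "error: proof uses sorry")
    return CheckResult(True, "")


if __name__ == "__main__":
    print(prove_loop("1 + 1 = 2", toy_propose, toy_check))
```

The point of the design is that nothing is returned unless the checker has accepted it, which is the "automatic filter" role the post attributes to Lean. A real system would presumably add search, caching, and a human review step on top.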

You know we hit AGI when the AI solves problems that the vast majority cannot remotely grok. Furthermore, I doubt anybody without a long formal education can grok what the hell was just solved. So those advocating for the end of higher education and research have a Dunning-Kruger syndrome.

Przemek Chojecki | PC @prz_chojecki

Another 9 open Erdos problems solved, this time by DeepMind team. Interesting loop of LLM - Lean agents working autonomously, and only after it's verified formally, going through human review.

6:29 AM · May 24, 2026 · 576.7K Views
2:34 PM · May 24, 2026 · 2.5K Views

In sci-fi books and movies, AI solving a bunch of math problems that stood open for decades would've been a big deal. Why isn't the mainstream media turning this into a bunch of sensational stories?

AI Notkilleveryoneism Memes ⏸️ @AISafetyMemes

I'm old enough to remember when everyone thought AI solving ONE novel math problem would be a front page story around the world Today, AI solved not one, but NINE open problems - some 50 years old. AND proved ***44*** out of 492 open OEIS conjectures. Zero media coverage.

6:06 PM · May 24, 2026 · 127K Views
8:27 AM · May 25, 2026 · 10.1K Views

(It's because humanity is sleepwalking into the creation of machine superintelligence. Once you realize that, it's easier to realize that we're also sleepwalking into a whole lot of danger.)

Nate Soares ⏹️ @So8res

In sci-fi books and movies, AI solving a bunch of math problems that stood open for decades would've been a big deal. Why isn't the mainstream media turning this into a bunch of sensational stories?

8:27 AM · May 25, 2026 · 10.1K Views
8:27 AM · May 25, 2026 · 1.2K Views