over the weekend i had another obvious thing to check, namely whether claude autonomously resolves the famed sum-product conjecture over the reals. answer: yes
Anthropic's Levent Alpöge says Claude autonomously resolved the sum-product conjecture over the real numbers
Mark Chen linked autonomous theorem-proving to finding cyber vulnerabilities.
Many users called Claude's autonomous resolution of the sum-product conjecture over the reals exciting and impressive for showing strong math performance, while others dismissed the result as overhyped or questioned the model's reliability.
When Mythos came out, my immediate thought was "if our models can prove 80-year-old theorems, surely they can find cyber vulnerabilities too." And they did.
I imagine the researchers there are thinking the same thought in reverse.

This is very cool. Is this mythos powering the main system?
Are you going to try running it on some still-open conjectures of similar difficulty? It seems likely you could resolve some of them.
Personally I would love to see progress on Seymour's second neighborhood conjecture. This was one of the first open problems I ever tried to work on as a baby-faced 17-year-old fresh out of high school. I didn't make any progress but I still had fun trying things.

@markchen90 I think this underlines an often underscored part of 'AI research' --- when we have a model we are not sure what its capabilities are. discovering this and advancing something along the way is an independent contribution, even if it then can be replicated by someone else

@__alpoge__ I don't really understand how many open conjectures I should expect to be resolved if you ran mythos on 10,000 of them. You're framing it as if you keep checking one and finding claude can solve it; have you actually tried thousands? Why *not* run it on 10,000 open problems?

this conjecture has been central in combinatorics and its applications for quite some time, and honestly i am still pretty amazed it was disproved last week by Bloom, Sawin, Schildkraut, and Zhelezov, their setup having lots in common with claude’s simple disproof of the unit distance conjecture. actually i want to note they got to their construction independently over that same weekend, and indeed it was with Sawin’s blessing that i ran this quick test.

here is opus 4.8’s commentary on that attempt, while also caricaturing my requested style a bit…:) https://www-cdn.anthropic.com/files/4zrzovbb/website/6c28f488d07644c212e352014108c8ed23cc7f76.pdf
(notice, by the way, that this features another construction ruling out collisions. and an easter egg irrelevant factor of 2…)

@markchen90 yea

@__alpoge__ @_sholtodouglas You should publish the prompts /drafter /harness because as we know, a lot depends on how you instruct a model and what the environment is

here i think claude again found an overlooked very simple path, namely it chooses its usual box to be of elements congruent to 1 modulo a parameter slightly larger than the square of the heights of the units, so that a collision would mean a ratio of units congruent to 1 modulo an ideal of norm so large that the congruence would upgrade to equality. (i would call this using p-adic rather than archimedean repulsion.)
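
one way to spell out why such a congruence rules out collisions, as a sketch only. the assumptions here are not from the thread: K is a number field of degree d, m is a positive integer, every unit u in the unit box U satisfies |σ(u)| ≤ h and |σ(u⁻¹)| ≤ h at every embedding σ, and every element of the usual box B is congruent to 1 modulo m. the names U, B, d, m, h are labels chosen for illustration, and the actual argument in the writeup may differ in its details.

```latex
% Hypothetical notation: U = unit box, B = usual box, d = [K:\mathbb{Q}].
\begin{align*}
&\text{Suppose } ub = u'b' \text{ with } u,u' \in U,\ b,b' \in B, \text{ and put } \varepsilon = u/u' = b'/b. \\
&b(\varepsilon - 1) = b' - b \in m\mathcal{O}_K, \text{ and } b \equiv 1 \pmod{m\mathcal{O}_K} \text{ is invertible mod } m, \text{ so } \varepsilon - 1 \in m\mathcal{O}_K. \\
&\text{If } \varepsilon \neq 1:\quad m^d \le |N_{K/\mathbb{Q}}(\varepsilon - 1)| = \prod_{\sigma} |\sigma(\varepsilon) - 1| \le (h^2 + 1)^d. \\
&\text{Hence } m > h^2 + 1 \implies \varepsilon = 1 \implies (u,b) = (u',b'), \text{ so } |U \cdot B| = |U|\,|B|.
\end{align*}
```

under these assumptions the threshold h² + 1 is what "slightly larger than the square of the heights of the units" would amount to: a nonzero element of m𝒪_K has norm at least m^d, while ε − 1 is too small at every embedding for that.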

@__alpoge__ any chance you can share the harness you are using?
@markchen90 Exactly

@__alpoge__ @grok what is the sum product conjecture?

as before i dropped the problem into the harness, blocked internet access, and made sure no information about their construction leaked to the claude code instance. i suppose it’s no surprise it quickly arrived at the same setting as it did in the unit distance conjecture: real quadratic field with sufficiently divisible discriminant, pass to 2-class-field tower, consider a unit box and a usual box, and now combine them via multiplication rather than via addition as in the unit distance argument.

@markchen90 GPT 5.6 when

the point is then rather simple and has long been understood after Balog-Wooley, namely that one obtains a power savings when multiplying the set with itself from the same for the multiplicative part, whereas one obtains a power savings when adding the set to itself from the same for a dilate of the additive part.
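
for readers who want the statement in symbols, here is the conjecture together with the two elementary inequalities that the sentence above seems to rely on. this is a sketch: writing the set as A = U · B, with U the multiplicative part and B the additive part, is an assumed reading of the thread, and the names are illustrative.

```latex
% Sum-product conjecture (Erdős-Szemerédi), stated for finite A \subset \mathbb{R}:
\[
\forall \epsilon > 0\ \exists c_\epsilon > 0:\quad
\max\bigl(|A + A|,\ |A \cdot A|\bigr) \ge c_\epsilon\, |A|^{2 - \epsilon}.
\]
% Elementary bounds for A = U \cdot B, with U and B finite sets of nonzero elements
% (hypothetical names; U/U denotes the set of ratios u'/u):
\begin{align*}
A \cdot A = (U \cdot U)(B \cdot B)
  &\implies |A \cdot A| \le |U \cdot U|\,|B|^2, \\
A + A = \bigcup_{u,u' \in U} (uB + u'B)
  &\implies |A + A| \le |U|^2 \max_{\lambda \in U/U} |B + \lambda B|.
\end{align*}
```

if there are no collisions, so that |A| = |U||B|, then a saving |U · U| ≤ |U|^{2−δ} gives |A · A| ≤ |A|² |U|^{−δ}, and a saving |B + λB| ≤ |B|^{2−δ} for every dilate λ gives |A + A| ≤ |A|² |B|^{−δ}. each is a power saving in |A| as long as |U| and |B| are both at least a fixed power of |A|.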

this time the model did not need to come up with the trick of adjoining a fourth root of the discriminant because it no longer needed lots of complex places to produce relative units and thus unit distances, but it did need a trick to ensure the product of the unit box and the usual box had the expected size, i.e. to rule out collisions.

in the case of the Bloom-Sawin-Schildkraut-Zhelezov argument this is the technical complication, and to overcome it they use an annulus construction and an absolute repulsion of units away from 1, via a classic Mahler measure lower bound of Schinzel or via a cute argument suggested to the authors by GPT-5.5 Pro.

i also wanted to make sure people are calibrated on these capabilities. it seems to be expected that such checking involves drowning in incorrect proofs, and indeed this may have occurred with such exercises in the past. instead i wait a bit and eventually something gets past claude’s refereeing (last time all it took was a few hours very early Saturday morning), and so far the output of that process has been a correct, very simple proof. actually i even simplify it further for myself, by taking that output and asking the model to rewrite its writeup in my style, which adds a bit of comedy to the endeavour.

one’s gut reaction is worry in imposing such an onerous congruence, but the point is that one can just combine a very small box of units with a much larger box and still win, since one is just trying to save _some_ power, and all radii in sight are constants.