Fable achieved a significant breakthrough in one of our open problems. This is a problem where ChatGPT 5.5 could not even begin anything useful. The breakthrough seems legit (although not 100% checked yet), and Fable even claims to have a full solution. >10 hours total runtime so far. A 30-page document with the proofs of some lemmas not yet spelled out. We cannot yet know whether Fable has indeed solved it, but even if it is just a partial solution, we are absolutely amazed. More details will follow, and once we are at the end of the story, I will also write a full Substack post. Collaboration with István Vona, a postdoc in my group.
AI system Fable generates 30-page partial proof of unsolved math problem, succeeding where ChatGPT 5.5 failed
The results, developed with postdoc István Vona, await verification
Positive users highlight Fable's extended runtime as a breakthrough unlocking open math problems beyond GPT-5.5, while negative users dismiss the claims as clickbait slop or hype.
Dario may be very well calibrated with his "millions of chips is all that matters to beat Chyna" posture. Yeah it'll often be less efficient. Does that truly matter for tasks you can't solve by paying *any* amount of money, because there is no more qualified labor on the market?
@pozsgaybalazs Raw runtime is the new benchmark. Moving to a 10 hour continuous reasoning path changes everything. Even with holes, 30 pages of lemmas is audit ready.

@pozsgaybalazs By GPT 5.5 do you mean GPT 5.5 Pro or the regular GPT 5.5 xHigh?

@pozsgaybalazs mythos is misaligned so no paper is accepted.
Give us the @leanprover project

@pozsgaybalazs I hear you got some slop from the slop farm?

@pozsgaybalazs @roydanroy Why frame it like that? The language around the alternative charges this where it should be objective.

@pozsgaybalazs Please post a screenshot of an actual response from Fable 5. Any random response.

@pozsgaybalazs How do you work with it? Put something like /goal in claude code and ask it to do a paper, solve a problem? How can you make it think for so long otherwise?

Can we stop with this argument? The engineers and researchers at Anthropic are elite and they're absolutely on par with the researchers at OpenAI. This was never about chips or compute. If it were, Google which has the best chips, the largest compute infrastructure, and the best data on the planet wouldn't be struggling. Same applies to xAI.

@pozsgaybalazs Did it actually spit out an answer or just give you its CoT?

@pozsgaybalazs @roydanroy Hey, please keep us updated?

@pozsgaybalazs Be aware it’s not a model anymore but an agentic system. You can’t compare it to a regular model. You would need a harness.

@pozsgaybalazs terao’s?

@chewkokwah @pozsgaybalazs isn't gpt5pro mostly a cot/tot harness on top of 5.5? i hope if they r in a hurry to solve math problems they are not just trying with a thin api harness but have created one. or tbh just use Cursor

@pozsgaybalazs Fable deez nuts nigga

@pozsgaybalazs Prove it. Show the data. This post evidences nothing besides clickbait. Prove. It.

@pozsgaybalazs what's the verifier here because 10 hours of search can manufacture polished false positives

@based_buffalo69 Someone has a chip on their shoulder. I can smell the resentment through the screen. You know deep down this isn't an example of that, yet you feel so threatened by it that you have to denigrate it without knowing anything about this particular case.

@pozsgaybalazs Guys they listened to me. It works. Thanks Dario.

The >10h runtime detail is what stands out most. We've been so fixated on fast inference that we've barely explored what sustained computation unlocks. If even partial progress on an open problem is achievable at this scale, the real bottleneck shifts to verification — mathematicians reviewing AI-generated proofs. What's the domain?