Tech · 3h ago

Fable 5 and GPT 5.5 top updated ArXivMath and BrokenArXiv benchmarks, but critic warns of excessive token consumption

Teortaxes noted the models missed expected WeirdML efficiency profiles.

#501

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex · #501 in Tech

but honestly, this isn't the WeirdML profile it clearly still sucks in this. Too many tokens

Tim @TimGMath

Results for BrokenArXiv:

12:09 PM · Jun 11, 2026 · 1.5K Views

Sentiment

One commenter praises MathArena for running thorough benchmarks such as ArXivMath, while noting that few organizations run GPT Pro because of its cost, which leaves an incomplete picture.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

VIEWS 1.5K

Tim @TimGMath

Results for BrokenArXiv:

7h1.5K51

BOOKMARKS 1 · REPLIES 1

Tim @TimGMath

Further: Fable 5 is less expensive than Opus 4.8 on ArXivMath, since it uses fewer tokens. Further, Gemini-3.1-Pro scores quite poor this month, with DeepSeek-v4-Flash outperforming it.

7h38831

LIKES 9

Tim @TimGMath

Despite its impressive performance, Fable 5 is much more expensive than GPT 5.5 and requires a comparison with GPT-5.5-Pro for an accurate evaluation of its capabilities, but we can currently not make this comparison due to the costs of GPT-5.5-Pro.

7h42591

RETWEETS 9

Tim @TimGMath

The latest versions of ArXivMath and BrokenArXiv have been released! Impressive Performance of Fable 5, which takes the top spot on ArXivMath. On BrokenArXiv, GPT 5.5 continues to be in the lead.

7h16.5K8526

Tim @TimGMath

Full results: http://matharena.ai

7h3072

Samian Noesis @samiannoesis

@TimGMath @Liam06972452 Incredible! Is broken arxiv math an alternative version of the same problems or a different proposal altogether?

5h144

Florian Brand @xeophon

@TimGMath have you approached openai for credits for gpt pro? i love matharena and so few orgs run gpt pro due to costs, painting an incomplete picture :( maybe @reach_vb could help you find the correct person for grants?

2h3