
Anthropic's Claude Mythos 5 Leads GDP.pdf and RiemannBench AI Benchmarks

Original post
echen (@echen)

GDP.pdf measures whether models can read the messy professional documents - wiring diagrams, rocket schematics - that run the world.

Riemann-bench measures research-level math, written by ivy league profs and IMO medalists in the course of their work.

...and climbing them both?...

the stuff of fables 😎

congrats anthropic!

12:19 PM · Jun 9, 2026 · 1.8K Views
echen (@echen)

https://surgehq.ai/leaderboards/gdp-pdf
https://surgehq.ai/leaderboards/riemann-bench
