Claude Fable 5 scores very well on FrontierMath: Tiers 1–4 (v2), reaching 87% on Tiers 1–3 and 88% on Tier 4. This continues a streak of Anthropic models improving rapidly at math.
Anthropic's Claude Fable 5 scores 88% on FrontierMath Tier 4, outperforming OpenAI on the benchmark for the first time
Story Overview
Anthropic's Claude Fable 5, the consumer version of its Mythos-class model released around June 9-10, posted 87 percent on FrontierMath Tiers 1-3 and 88 percent on Tier 4. Epoch AI Research reported these as the highest scores among the models it has run on the v2 dataset, while noting that its runs of GPT Pro models are still in progress. Commenters pointed out that this is the first time an Anthropic model has led OpenAI on the Tier 4 problems, which are authored by expert mathematicians and kept unpublished.
Rapid gains keep beating forecasts
About halfway through the year, the Tier 4 result beat one forecaster's above-median prediction by 25 points, extending a streak of math-reasoning gains across recent Claude releases.
One number hints at broader capability
According to one post, the Tier 4 score implies an Epoch Capabilities Index (ECI) of 164, though that figure is an inference rather than a separately published score for this model variant.
Supporters cite Claude Fable 5's FrontierMath lead over OpenAI as evidence of rapid reasoning gains, while critics dismiss the results as irrelevant shilling or Anthropic hype that ignores real-world needs.
Most Activity
The shape of the graph is getting very familiar.
Looking at the graph, I think Fable 5 will only maintain its lead up to GPT-5.6.
And secondly, I think the benchmark will soon be completely saturated.
Anthropic is currently just scale-mogging everyone
scaling just works and will continue to work
Fable 5 scored 88% on FrontierMath Tier 4; this implies an ECI score of 164
it's also the first time Anthropic has a model that is better at math than OpenAI's
while it is consistent with my predictions on paper, this still gives me an astonishing amount of future shock
We're ~halfway through the year, and Fable has beaten my forecast (which was above the median forecast!) for FrontierMath Tier 4 by ... 25 points!
Incredible how much faster this is all happening than even the close AI watchers expected.
Co-Mathematician has retained its position relative to other models, save Fable which has taken the lead. I don't think Fable will stay there for long.
Claude Fable 5 result for FrontierMath T4 has just come in and it is vastly SoTA.

These are the highest scores among models we have run on the recently released v2 dataset, though our runs of GPT Pro models are ongoing. Find all scores on our website.
https://epoch.ai/frontiermath/tiers-1-4

@EpochAIResearch I've got a feeling they won't be leading this particular one for very long.
numbers be high

@kimmonismus New benchmarks for newer models until humans can’t create benchmarks 🤪

@gh0stpen dario saved us all

@scaling01 Fable is indeed amazing. However, they haven't benchmarked 5.5 Pro which should be definitely better than regular 5.5. Not to mention 5.6 Pro. I suspect within two weeks OpenAI will have the crown again in math.

@ypsehlig both

@iruletheworldmo Is this with or without the lobotomy?

@AndrewCurran_ @EpochAIResearch Is the feeling based on seeing gpt 5.6 leaks?

@emollick It’s a wall. Gary was right.

@EpochAIResearch @alexalbert__ I am yet to find a use case where you need more juice than 5.5

@iruletheworldmo we are genuinely so back and it's not even funny (it's hilarious)

@DDhfrcycyf5vuf @scaling01 I am confident they don't because in an earlier post they said they will soon test Fable and GPT 5.5-Pro (which obviously now have done with Fable).