/Tech · 1d ago

Cognition launches FrontierCode benchmark as Anthropic's Claude Fable 5 outperforms GPT 5.5 on agentic coding

AI Judge changed title after evaluation, original title: "Claude Fable 5 launches, setting a record 72.9% on CursorBench and scoring 80.3% on SWE-Bench Pro"

Claude Fable 5 scored 80.3% on SWE-Bench Pro.

5.1K · 79K · 4.4K · 14.1K · 11.3M

Original post

Andrew Curran@AndrewCurran_ · #436 in Tech

Andrew Curran@AndrewCurran_

10:07 AM · Jun 9, 2026 · 729 Views

Sentiment

Positive users praise Claude Fable 5's record benchmark scores in coding and research while negative users complain about high costs, hype, and usability problems.

Pos

54.5%

Neg

45.5%

438 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 2.2M · Bookmarks 5.9K · Likes 23.4K · Retweets 2.2K · Replies 1.1K

Andrej Karpathy@karpathy

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

Claude@claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision.

The longer and more complex the task, the larger Fable 5’s lead over our other models.

23h · 2.2M · 23.4K · 5.9K

Deedy@deedydas

Claude Fable 5 is by far the most ridiculous model that makes me genuinely afraid for the future of software engineering.

I compiled the top 10 most unbelievable things I've seen Claude Fable 5 do today:

- Migrate a 50M-line codebase at Stripe in a day (humans take 2mos)
- Draw amazing 3D graphics: a) Boeing 747 b) space simulations with >5000 objects c) Minecraft roller coasters d) full photorealistic forest scenes e) NYC skyline f) stormy clouds
- One-shot Pokemon FireRed the game
- Optimize a real world proprietary interaction net evaluator 10x more than the next best model, GPT 5.5

AND Fable 5 ($10/M input, $50/M output) is about the same price as GPT 5.5 ($10/M input, $45/M output) and 6x cheaper than GPT 5.5 Pro.

16h · 689.5K · 5.2K · 2.8K
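To make the price comparison in the post above concrete, here is a minimal sketch that applies the quoted per-million-token prices to a single job. The prices are taken from the post as written and are not verified; the workload size (2M input tokens, 500K output tokens) and the `job_cost` helper are hypothetical and only illustrate the arithmetic.

```python
# Per-million-token prices in USD, as quoted in the post above (unverified).
PRICES = {
    "GPT 5.5": {"input": 10.0, "output": 45.0},
    "Claude Fable 5": {"input": 10.0, "output": 50.0},
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one job at the quoted prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 2_000_000, 500_000):.2f}")
```

At these quoted prices the two models differ only on output tokens, so the gap grows with how much the model writes rather than how much it reads.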

Cursor@cursor_ai

Claude Fable 5 is now available in Cursor.

It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.

1d · 1.1M · 5.8K · 633

Chubby♨️@kimmonismus

Claude 5 Fable tl;dr

- It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research

- The longer and more complex the task, the larger Fable 5’s lead over our other models

- It's more token-efficient than past Claude models

- Fable 5 stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes

Fable 5 is more than just better benchmarks. It's more efficient, allows for longer work periods, offers better context management, and so much more.

GPT-5.6 is just around the corner.

I'm a huge Codex fan, but Fable/Mythos is in a league of its own. I'm curious to see if OpenAI will release its own Mythos.

"During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand."

Chubby♨️@kimmonismus

Claude 5 Fable Benchmarks!

Holy moly, significant jump even to Mythos

1d · 560.7K · 1.8K · 449

signüll@signulll

this is absolutely incredible.

1d · 219.5K · 2.1K · 301

driss guessous@drisspg

Holy chart crime

Cursor@cursor_ai

Claude Fable 5 is now available in Cursor.

It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.

23h · 223.6K · 1.4K · 79

Lisan al Gaib@scaling01

I never thought we would get another GPT-4 moment

20h · 70.1K · 1.1K · 78

Chubby♨️@kimmonismus

It's already June 9th, and Gemini 3.5 Pro and GPT-5.6 are nearing release (Google even announced 3.5 Pro during I/O already).

Rumor has it that GPT-5.6 will be released as early as next week.

So far, it's safe to say that - guardrails aside - Anthropic is truly the frontier lab that's entering a new league with Mythos/Fable.

Gemini 3.5 Pro and GPT-5.6 have a lot to deliver and are now under pressure.

This release has certainly boosted Anthropic's upcoming IPO. Anthropic has proven that they are still capable of making significant leaps in performance and efficiency. There's no end in sight.

But the pressure on the competition is mounting.

And remember that Claude Mythos was (and probably still is) the leader in long-horizon software tasks.

Chubby♨️@kimmonismus

Claude 5 Fable tl;dr

- It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research

- The longer and more complex the task, the larger Fable 5’s lead over our other models

- It's more token-efficient than past Claude models

- Fable 5 stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes

Fable 5 is more than just better benchmarks. It's more efficient, allows for longer work periods, offers better context management, and so much more.

GPT-5.6 is just around the corner.

I'm a huge Codex fan, but Fable/Mythos is in a league of its own. I'm curious to see if OpenAI will release its own Mythos.

19h93.4K693112

Chubby♨️@kimmonismus

The guardrails are way too strict. Even the simplest questions get cut off immediately.

And it's only on the schedule until June 22nd.

Damn, Anthropic really thinks the model is too powerful.

Chubby♨️@kimmonismus

Claude 5 Fable tl;dr

- It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research

- The longer and more complex the task, the larger Fable 5’s lead over our other models

- It's more token-efficient than past Claude models

- Fable 5 stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes

Fable 5 is more than just better benchmarks. It's more efficient, allows for longer work periods, offers better context management, and so much more.

GPT-5.6 is just around the corner.

I'm a huge Codex fan, but Fable/Mythos is in a league of its own. I'm curious to see if OpenAI will release its own Mythos.

23h136.9K67883

claire vo 🖤@clairevo

Fable 5 (aka baby Mythos) just dropped. Is it as scary (or scary good) as they claim?

My thoughts after some early testing:
- smart smart smart (crushed SWE bench)
- but do you always need hyper intelligence?
- faceplanted on one-shot design in a way that shocked me
- i'm not sure about dynamic workflows + complex subagents. they work, but at what cost?
- def knocked out technical work well
- ootb bad at making technical docs + specs for humans. probably really good docs for agents. but nearly impossible to parse prose.
- A++ vision and document formatting. this was my favorite part

NOT a daily driver, wouldn't put this model in a meeting, but def will keep it back in the server rack, churning out code.

Full take on YT: https://www.youtube.com/watch?v=IREnr4I89Ho

Claude@claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

Its capabilities exceed those of any model we’ve ever made generally available.

22h80.8K242153

Liv@livgorton

I joined Anthropic ~a month ago and have written ~no code myself. I went from getting quite frustrated with coding agents even 6 months ago, giving up and writing some of the code myself, to a big part of my role now being agent management.

Nat McAleese@__nmca__

fable (well, mythos) has been transformational to my day to day work. I always felt Opus 4.5 could barely code; 4.6 was just-about-useful, but I have barely written a line of code since fable.

23h38.7K455128

Bindu Reddy@bindureddy

Results from Internal Coding Evals For Claude Fable

- For 98% of tasks, it simply does the same thing as GPT 5.5 or Opus 4.8 and costs 2x

- For 2% of hard coding tasks, it does make sense if you are willing to pay double and get some quality gains

So ideally, you want to ROUTE VERY hard tasks to Fable

22h49K53391
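The routing idea in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Task` type, the `difficulty` score (imagined as coming from some upstream classifier), the 0.9 threshold, and the model labels are all hypothetical and do not come from the post or from any real API.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    difficulty: float  # hypothetical 0..1 score from an upstream classifier


# Assumption: only the hardest few percent of tasks clear this bar.
HARD_THRESHOLD = 0.9


def route(task: Task,
          hard_model: str = "claude-fable-5",      # hypothetical label
          default_model: str = "cheaper-model") -> str:  # hypothetical label
    """Send very hard tasks to the expensive model, everything else to the default."""
    return hard_model if task.difficulty >= HARD_THRESHOLD else default_model


print(route(Task("rename a variable", 0.1)))
print(route(Task("codebase-wide migration", 0.97)))
```

The threshold is the main design knob: the post's 98% / 2% split suggests setting it high enough that only a small fraction of tasks pay the higher price.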

taoki@justalexoki

Claude@claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision.

The longer and more complex the task, the larger Fable 5’s lead over our other models.

1d · 363K · 14.8K · 575

Yuchen Jin@Yuchenj_UW

Claude Fable 5 / Mythos 5 wins everywhere.

I thought Fable 5 was just a nerfed Mythos Preview, but it’s literally better. SWE-Bench Pro: Fable 5: 80.3%, GPT-5.5: 58.6%.

And the price is only 2x Opus 4.8: $10/input MTok, $50/output MTok.

I don't think GPT 5.6 can beat this...

1d39.7K48947

Lisan al Gaib@scaling01

you're totally right open-source is going to catch up in 4 months

18h64K51447

eric zakariasson@ericzakariasson

go try out fable in cursor, it's an incredible but expensive model!

Cursor@cursor_ai

Claude Fable 5 is now available in Cursor.

It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.

1d48K50632

Lisan al Gaib@scaling01

Anthropic has a coding MOAT

Nat McAleese@__nmca__

welcome to the world, Claude Fable 5!

1d32.4K51139

Claude@claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision.

The longer and more complex the task, the larger Fable 5’s lead over our other models.

1d · 4.8M · 15K · 2.3K

Daniel@growing_daniel

Fire everyone now

23h39.4K40930

Julian Schrittwieser@Mononofu

I’m incredibly excited that Fable is now available for everyone! I’ve been blown away by how smart it is - it one-shots entire PRs for me, finds obscure bugs and has written all my code since I started using it.

Claude@claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

Its capabilities exceed those of any model we’ve ever made generally available.

17h43.5K28741