Anthropic releases Claude Fable 5, topping SWE-bench but struggling with one-shot design and technical documentation

Ethan Mollick built a data tool in 9.5 hours.

Original post
Alex Volkov @altryne · #1378 in Tech

@karpathy Andrej do you still think that humans reading code will be a requirement for production at the end of this year?

http://thursdai.news/zl

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

11:13 AM · Jun 9, 2026 · 1.1K Views
Sentiment

Positive users praised Claude Fable 5 reviews as helpful and expressed excitement about its capabilities, while negative users criticized its frequent refusals of ambitious tasks and questioned testing practices.

Pos 50.0% · Neg 50.0%
11 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
VIEWS 81.6K

Fable 5 (aka baby Mythos) just dropped. Is it as scary (or scary good) as they claim?

My thoughts after some early testing:
- smart smart smart (crushed SWE bench)
- but do you always need hyper intelligence?
- faceplanted on one-shot design in a way that shocked me
- i'm not sure about dynamic workflows + complex subagents. they work, but at what cost?
- def knocked out technical work well
- ootb bad at making technical docs + specs for humans. probably really good docs for agents. but nearly impossible to parse prose.
- A++ vision and document formatting. this was my favorite part

NOT a daily driver, wouldn't put this model in a meeting, but def will keep it back in the server rack, churning out code.

Full take on YT: https://www.youtube.com/watch?v=IREnr4I89Ho

Claude @claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

Its capabilities exceed those of any model we’ve ever made generally available.

1d · Views 81.6K · Likes 247 · Bookmarks 153
BOOKMARKS 231 · RETWEETS 52 · REPLIES 60

Wrote up my initial impressions of Claude Fable 5 - it has a big model smell: slow, expensive and capable of crunching through pretty much everything I threw at it https://simonwillison.net/2026/Jun/9/claude-fable-5/

23h · Views 42K · Likes 527 · Bookmarks 231
LIKES 703
snow @snowclipsed

>You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

i am sorry, but it literally is refusing half of these.

1d · Views 50.6K · Likes 703 · Bookmarks 107
Hensen Juang @basedjensen

Unfortunately even karpathy can't save anthropic bros from this hole they dug themselves in

19h · Views 40K · Likes 385 · Bookmarks 28
Jake @JakeKAllDay

@snowclipsed also, respectfully, Andrej, who the fuck is using a frontier model at $50/mtok out to make html artifacts?

23h · Views 263 · Likes 6 · Bookmarks 4
Shannon Sands @max_paperclips

@snowclipsed his own autoresearch project would be refused

21h · Views 709 · Likes 30
snow @snowclipsed

@max_paperclips (from a friend) it refused to thoroughly look through the nanogpt speedrun library, basically a starting point for a lot of optimizer autoresearch

21h · Views 653 · Likes 16

Here are Fable's pelicans for the different thinking effort levels, plus how much each one cost to generate via the Claude API

23h · Views 1.5K · Likes 9 · Bookmarks 1

@OrganicGPT Posted about that here https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/

22h · Views 395 · Likes 6
snow @snowclipsed

@xXshaurizardXx nope, memory is disabled for me. it flat out refuses anything ML engineer aligned, or even anything sufficiently advanced or over-casually phrased to it.

1d · Views 265 · Likes 7
Akhil Ivaturi @GordianKnot256

@snowclipsed Tbf, he never said Fable would answer any of that, just that you can ask it.

1d · Views 144 · Likes 4

@clairevo This is really helpful. I have til the 22nd to use this and it eats up my usage. I have to prioritize and will use it to solve a couple lingering buggies.

1d · Views 36 · Likes 1 · Bookmarks 1
Nick Ducoff @nickducoff

@clairevo This was a great overview, thanks Claire!

1d · Views 352 · Likes 3
shaur @xXshaurizardXx

@snowclipsed is this a memory thing

1d · Views 305 · Likes 2
Conor Bronsdon @ConorBronsdon

@clairevo Re subagents: I recommend having Fable orchestrate a bunch of Sonnet/Opus sub agents to cut token costs.

Fable is excellent at orchestrating sub agents and this significantly reduces your overall costs while still getting most of the benefits of the ultra-smart lead model.

1d · Views 154 · Likes 2
Greyson MacAlpine @GreysonSofia

@clairevo Queued the full YT take to watch this afternoon! 🚀

1d · Views 185 · Likes 1
Behnam @OrganicGPT

@simonw Simon. you are influential in AI. Please push back against the guardrails that limit LLM research. Anthropic shouldn't be allowed to stop other researchers from developing new AI models using Claude. It's a sad day...

22h · Views 368

@ConorBronsdon Yeah but the subagents just aren’t adding a lot to the quality of output. What are your favorite use cases

1d · Views 112 · Likes 1
Sojunky @sojunky

@clairevo Fable apparently thinks oncology genes like KRAS and EGFR pose a "biology" risk --> punted me back to 4.8!

1d · Views 104 · Likes 1
sarang @EsPeeKid

@clairevo been patiently waiting for the day of review!

1d · Views 100 · Likes 1