/AI · 3h ago

Anthropic releases Claude Fable 5, topping SWE-bench but struggling with one-shot design and technical documentation

Ethan Mollick built a data tool in 9.5 hours.


#1184

Original post

Alex Volkov@altryne · #1245 in AI

@karpathy Andrej do you still think that humans reading code will be a requirement for production at the end of this year?

http://thursdai.news/zl

Andrej Karpathy@karpathy

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

11:13 AM · Jun 9, 2026 · 934 Views

Sentiment

Positive users are excited about Claude Fable 5's SOTA benchmarks (including SWE-bench) and its qualitative leap on long, difficult tasks. Negative users criticize its high token use, poor one-shot design output, and biology-risk flags triggered by benign queries such as oncology genes.

Positive 57.1% · Negative 42.9%

7 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 33.8K · Bookmarks 76 · Likes 128 · Retweets 6 · Replies 18

claire vo 🖤@clairevo

Fable 5 (aka baby Mythos) just dropped. Is it as scary (or scary good) as they claim?

My thoughts after some early testing:
- smart smart smart (crushed SWE bench)
- but do you always need hyper intelligence?
- faceplanted on one-shot design in a way that shocked me
- i'm not sure about dynamic workflows + complex subagents. they work, but at what cost?
- def knocked out technical work well
- ootb bad at making technical docs + specs for humans. probably really good docs for agents. but nearly impossible to parse prose.
- A++ vision and document formatting. this was my favorite part

NOT a daily driver, wouldn't put this model in a meeting, but def will keep it back in the server rack, churning out code.

Full take on YT: https://www.youtube.com/watch?v=IREnr4I89Ho

Claude@claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

Its capabilities exceed those of any model we’ve ever made generally available.

3h · 33.8K views · 128 likes · 76 bookmarks

snow@snowclipsed

>You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

i am sorry, but it literally is refusing half of these.

Andrej Karpathy@karpathy

1h1.6K512

Built by APE | Austin Product Engineering@Built_by_APE

@clairevo This is really helpful. I have til the 22nd to use this and it eats up my usage. I have to prioritize and will use it to solve a couple lingering buggies.

2h3611

Conor Bronsdon@ConorBronsdon

@clairevo Re subagents: I recommend having Fable orchestrate a bunch of Sonnet/Opus sub agents to cut token costs.

Fable is excellent at orchestrating sub agents and this significantly reduces your overall costs while still getting most of the benefits of the ultra-smart lead model.

2h1542

Greyson MacAlpine@GreysonSofia

@clairevo Queued the full YT take to watch this afternoon! 🚀

2h1851

claire vo 🖤@clairevo

@ConorBronsdon Yeah but the subagents just aren’t adding a lot to the quality of output. What are your favorite use cases

2h1121

Sojunky@sojunky

@clairevo Fable apparently thinks oncology genes like KRAS and EGFR pose a "biology" risk --> punted me back to 4.8!

2h1041

sarang@EsPeeKid

@clairevo been patiently waiting for the day of review!

2h1001

Bachi@julibachi

@clairevo baby mythos lol

2h551

Siddharth Jaiswal@sdrth

@clairevo was the deck in the video made by Fable?

2h151

Alex McD@alexqmcd

@clairevo surprising to see those horrid design results. we've been getting pretty solid artifacts, eg: https://hyperagent.com/s/SwNlqxNVOhxSfr9qle_-dQ

hard agree about the density of its writing, still reaching for Sonnet there

2h146

Salma@Salmaaboukarr

@clairevo this is going to be good!

2h1.3K1

Olamide@PM_Mide

@clairevo Incredibly smart. But not as scary as we thought it'd be

2h1172

claire vo 🖤@clairevo

@sdrth Hah I temp didn’t have access so it was good ol opus

2h1321

claire vo 🖤@clairevo

@alexqmcd It was prompted w a pretty technical spec and I just wonder if it over rotated

2h1201

C.NASIR@CNASIR2_0

@clairevo I thought I would be scared. I’m just excited.

2h871

claire vo 🖤@clairevo

@EsPeeKid i got u!

2h641

claire vo 🖤@clairevo

@julibachi Safe and sound

2h441

claire vo 🖤@clairevo

@GreysonSofia Lmk what you think of what I think

2h119

claire vo 🖤@clairevo

@sojunky

2h88