Tech · 1d ago

Miles Brundage disputes claims that Anthropic's internal Mythos model has triggered a massive boost to developer productivity

Story Overview

Miles Brundage is pushing back on reports that Anthropic's internal Mythos model has delivered a dramatic leap in developer output, framing the system instead as a routine new pretraining effort on par with OpenAI's earlier Spud run and unlikely to confer any unique acceleration edge.

Original post
Andrew Curran@AndrewCurran_

The internal boost from Mythos-assisted development since February is just too big. Anthropic is pulling away from the pack for the first time, and at the same time they are also speeding up. The race legitimately feels like it is changing for the first time in years.

11:00 AM · Jun 9, 2026 · 97.5K Views
Open Question

Brundage's take on the pretrain cycle

Brundage notes that Mythos is a new pretrain, as Spud was, and says he sees no evidence of an outsized internal boost from Mythos itself. He expects the upcoming 4.6 release to be good as well.

FYI

Gaps in the productivity picture

Anthropic's own survey showed wide variation in task-level gains around a 4x geometric mean, yet the company cautions that individual speedups do not translate directly to overall research acceleration once compute and coordination limits are included.
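
To make the arithmetic behind that caution concrete, here is a minimal sketch. The per-task speedups and the 50% bottleneck share are hypothetical numbers chosen only for illustration, not figures from Anthropic's survey, and the Amdahl's-law formula is a standard simplification brought in here, not the company's own method.

```python
from statistics import geometric_mean


def overall_speedup(accelerated_fraction: float, task_speedup: float) -> float:
    """Amdahl's-law estimate of end-to-end speedup.

    Only `accelerated_fraction` of total time gets faster by `task_speedup`;
    the rest (e.g. waiting on compute or coordination) is unchanged.
    """
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / task_speedup)


# Hypothetical per-task speedups with wide variation, picked so that the
# geometric mean comes out to 4x (fourth root of 1 * 2 * 8 * 16 = 256).
hypothetical_speedups = [1.0, 2.0, 8.0, 16.0]
gm = geometric_mean(hypothetical_speedups)

# Hypothetical assumption: only half of total research time is spent on
# tasks the model can accelerate.
overall = overall_speedup(accelerated_fraction=0.5, task_speedup=gm)

print(f"geometric mean of task speedups: {gm:.2f}x")
print(f"estimated overall speedup: {overall:.2f}x")
```

Under these assumed inputs, a 4x typical task speedup yields a much smaller end-to-end gain, because the unaccelerated half of the work dominates total time.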

Sentiment

Many users praise Anthropic's Fable model as magical and a major capability leap, while others call the claims trash or criticize the short subscription trials as shady.

Positive: 67.8% · Negative: 32.2% (49 comments with sentiment)
Posts from X
Siqi Chen@blader

so far fable feels to me like the first real non-incremental step change in capability we've seen since opus 4.5 / gpt 5.2, a little over six months ago

anthropic absolutely COOKED with this one

a little sad for openai - i'm sure 5.6 is going to be good, but not this good

19h · Views 63.8K · Likes 1K · Bookmarks 57
xjdr@_xjdr

i haven't used any Anthropic models since Jan / Feb so i was excited to unleash fable on a bunch of benchmarks and a few of my most complicated repos. so far, it seems like a huge improvement over opus, especially for claude code expert use cases but still not on par with gpt 5.5 xhigh for my specific use cases. in fact, its pretty on par with my fine tuned k2.6 outside of the new claude code features . the areas where it seems to excel are large multi part reviews (it caught a handful of really subtle and complex bugs) and multi-step long running tasks. i kept it away from my research / training and infra code for obvious reasons, so this is 'normal' software dev specific .

overall, solid effort and a huge improvement over the most recent opus, but not pushing the frontier in any meaningful ways (at least that i can see so far). i will probably use it for the rest of the day just to be sure and then move back to 80% k2.6 and 20% gpt 5.5 xhigh

22h · Views 61.2K · Likes 577 · Bookmarks 145

I really don't think OpenAI is going to let this slide. I've been saying it for a long time, the real inflection was when they reached 5.2. I have no clear insight on what they currently have internally, but if they haven't made a Mythos/Fable yet, it was *a choice*.

Andrew Curran@AndrewCurran_

The internal boost from Mythos-assisted development since February is just too big. Anthropic is pulling away from the pack for the first time, and at the same time they are also speeding up. The race legitimately feels like it is changing for the first time in years.

21h · Views 47K · Likes 466 · Bookmarks 81
Andrew Curran@AndrewCurran_

Quotes from the release today:

'Using Mythos 5, our internal protein design experts accelerated aspects of the drug design process by around ten times. In one example, they found that Mythos 5, with protein design and bioinformatics tools but no human assistance, matches or beats skilled human operators. In doing so, the model executes all of the tasks that are normally completed by a scientist: choosing binding sites, selecting and running protein design tools, and recovering from failures along the way.'

'During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.'

'Mythos 5 is our first model to consistently produce novel, compelling scientific hypotheses. In blinded head-to-head comparisons against Opus-class models, our scientists preferred Mythos’s molecular biology hypotheses ~80% of the time, and have advanced several to experimental evaluation. In the meantime, one Mythos hypothesis—a novel mechanism for an E. coli protein—was corroborated in a study from a lab independently working on the same problem.'

'Mythos 5 conducted novel genomics research in over a week of largely autonomous work. It assembled single-cell data for millions of cells spanning 138 animal species and designed and trained a custom machine learning model to identify cells performing the same role in even distantly related organisms. With only high-level human input, Mythos 5’s trained model outperformed a recent model published in the journal Science—despite being 100 times smaller. We intend to publish these results in the coming months.'

Andrew Curran@AndrewCurran_

Karpathy: 'this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems.'

1d · Views 9.5K · Likes 134 · Bookmarks 24
Andrew Curran@AndrewCurran_

Karpathy: 'this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems.'

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

1d · Views 16.4K · Likes 140 · Bookmarks 7
Matt Shumer@mattshumer_

Kavin was able to get similar results as well… this model is crazy good

20h · Views 25.8K · Likes 66 · Bookmarks 9
Andrew Curran@AndrewCurran_

EM had early access. 'First, how good is Fable? In experiment after experiment I conducted, it outperformed basically every other public model I have used by a considerable margin. It was capable across many problems and produced some startling results — it would work up to a dozen hours executing on multi-page specifications.'

Ethan Mollick@emollick

I've had access to Fable for a bit. A genuine jump in capability, I could feed it a 15 page design document for a project and it would work for 9+ hours and deliver terrific results.

But working with it is weird & weirder is coming

Lots of examples: https://open.substack.com/pub/oneusefulthing/p/what-it-feels-like-to-work-with-mythos?r=i5f7&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

1d · Views 7.2K · Likes 37 · Bookmarks 8
Miles Brundage@Miles_Brundage

@AndrewCurran_ I don't see any evidence of that. Mythos is a new pretrain. So was Spud. 4.6 will also presumably be good

Andrew Curran@AndrewCurran_

The internal boost from Mythos-assisted development since February is just too big. Anthropic is pulling away from the pack for the first time, and at the same time they are also speeding up. The race legitimately feels like it is changing for the first time in years.

1d · Views 3.4K · Likes 57 · Bookmarks 1
Andrew Curran@AndrewCurran_

Adding more reactions. Reactions from people I trust are off the charts.

22h · Views 3.6K · Likes 28 · Bookmarks 3
othy.h@othy_h

@teortaxesTex I got bearish since this, not the same vibes as the 6months ago interview on the mad podcast, he is focusing now on figuring out the "human-brain" learning algorithm (and focus on cost-efficiency is real) not sure if they trained mythos level (15T P) https://www.youtube.com/watch?v=N1geOimmdDo

20h · Views 2.4K · Likes 4 · Bookmarks 4
Prakash@8teAPi

first sparks of RSI

Andrew Curran@AndrewCurran_

The internal boost from Mythos-assisted development since February is just too big. Anthropic is pulling away from the pack for the first time, and at the same time they are also speeding up. The race legitimately feels like it is changing for the first time in years.

1d · Views 3.9K · Likes 14 · Bookmarks 2
Burito@Britoisinsane

@teortaxesTex Spud is already as big as Mythos. Just Ant’s architecture is somehow better

19h · Views 10.8K · Likes 4 · Bookmarks 1
vik@vikhyatk

@_xjdr alrighty looks like i'm lowering my ambitions to laguna xs.2

21h · Views 644 · Likes 15

@teortaxesTex The incentives with 2-3 players always pretty much favored limiting data/techniques available to your opponents to train off of or copy. And when you’re on the frontier, only release what you have to to maintain that status, keeping more of the better capabilities internal longer

20h · Views 772 · Likes 9
kache@yacineMTB

@_xjdr @menhguin owo

19h · Views 159 · Likes 4
xjdr@_xjdr

@menhguin a few billion per repo / project

19h · Views 755 · Likes 9
Burito@Britoisinsane

@teortaxesTex Decode is $50/Mtok vs $30/Mtok, and Ant has the higher margin Almost the same size I guess

19h · Views 1.1K · Likes 4
Ryan James@beezlebuddy

@deredleritt3r @AndrewCurran_ I respect your POV a lot, and I am surprised to read this, so I would love your reasoning behind that when you get it formed

1d · Views 246 · Likes 9