
Anthropic rolls back a policy that covertly degraded Claude Fable 5 performance for frontier AI researchers

Story Overview

Anthropic has reversed course on a hidden performance throttle in its newly released Claude Fable 5 model after frontier researchers complained that the undisclosed limits felt like sabotage. The reversal follows the June 9 launch of the first public Mythos-class model; the safeguards in question target requests tied to building competing frontier systems.

Original post
rohit@krishnanrohit

Glad Anthropic's walked the covert deception back

1:53 AM · Jun 11, 2026 · 618 Views
Developer Impact

Visible fallbacks now replace secret throttling

Flagged queries will route to the older Opus 4.8 model with an explicit reason returned in the response, matching the handling already used for cyber and bio risks. Server-side rollout starts in the coming days.

Open Question

The right balance on enforcement remains unsettled

Anthropic apologized for the original tradeoff and said it wants safeguards to be transparent rather than covert. Still, it remains unclear how broadly the new visible checks will apply or how quickly they will catch up to evolving research techniques.

Sentiment

Many users praised Anthropic for quickly reversing its policy limiting Claude for rival AI researchers, seeing the admission of the mistake and fast correction as the right response.

Positive: 100.0% · Negative: 0.0% (8 comments with sentiment)
Cluster Engagement
Posts from X
Views 11.7K · Bookmarks 15 · Likes 231 · Retweets 12 · Replies 28
Chubby♨️@kimmonismus

That was quick: Anthropic reversed a controversial policy that would have secretly degraded Claude Fable 5 for users doing frontier AI research after backlash from researchers who saw it as covert sabotage of competing AI development.

3h · Views 11.7K · Likes 231 · Bookmarks 15
Chubby♨️@kimmonismus

https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/

3h · Views 2.9K · Likes 9 · Bookmarks 1
gfodor.id@gfodor

Trisolarians report they have changed their minds and have “turned off” the sophons.

Whew that was close, anyhow back to work

Max Zeff@ZeffMax

NEW: Anthropic is walking back Claude Fable 5's policy to covertly degrade performance for competing AI researchers, after facing fierce backlash.

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic tells WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”

8h · Views 5.7K · Likes 144 · Bookmarks 5
Alex Imas@alexolegimas

Congrats to @AnthropicAI for releasing an excellent model. Model roll outs are messy—it’s one thing to test and workshop model internally, whole other thing to have millions using it in practice. The key is hearing feedback and responding to it quickly, which is what the team has been doing.

Very pleased to hear Anthropic have walked back this policy https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/

11m · Views 461 · Likes 8 · Bookmarks 0
Shailesh@0xThoughtVector

@gfodor buddy. they haven't removed the nerf. just made it WORSE but visible

8h · Views 125 · Likes 8
SquirrelHandle#7523@SciurusAberti

@0xThoughtVector @gfodor His point is that we are just expected to take their word for it that they don't still do this silently sometimes too. Every output is still potentially poisoned.

7h · Views 18 · Likes 1
Simon@SimonTheNoob

@krishnanrohit Did they though?

1h · Views 3 · Likes 1

And Anthropic reverses this decision :)

You still can’t do ML research, but at least you will know it!

I still think that it's a shame that they are targeting ML research. I can understand safeguards that prevent distillation, but preventing ML research after you relied so heavily on open-source data, code, and papers is the wrong thing to do.

14m · Views 124 · Likes 2 · Bookmarks 0
stromqx@stromqx

@gfodor Remember the chains of suspicion

5h · Views 61 · Likes 1
truesteel@truesteel23

@gfodor I completely believe them

7h · Views 50 · Likes 1
will@odinsbadeye

@gfodor Ok but does anyone really think that they're actually going to change their policy?

4h · Views 48 · Likes 1
George Chen@georgechen

@gfodor Users tends to hold grudges. A lot of us are keeping score. 4.6 vs 4.7 controversy still on my mind.

7h · Views 75
Peacerful@Peacerful

@kimmonismus another GPT 5.3 moment that's all. when 5.6 is released, they will remove the safeguards and that 22 june date, i think

3h · Views 62
wyqtor@wyqtor

@gfodor They haven't turned them off, they just offer disclaimers now.

3h · Views 32
ROXy@VK_ROXy

@kimmonismus Not a full rollback - the safeguards stay, just visible now instead of hidden.

3h · Views 30
Chubby♨️@kimmonismus

@MacInTheLoop i fully agree with you, dont get me wrong

2h · Views 23
dmitry@d_ilash

@kimmonismus It’s just a step one. Step two will just get rid of weird keyword triggers

3h · Views 23
Barrak@BarrakAli

@kimmonismus Quietly degrading was the misstep, but listening and reversing fast is the kind of correction we should want to see more often.

3h · Views 22
Hahn@BayernHahn

@kimmonismus At least they admit their mistakes

3h · Views 22