
Anthropic rolls back a policy that covertly degraded Claude Fable 5 performance for frontier AI researchers

Story Overview

Anthropic has reversed course on a hidden performance throttle in its newly released Claude Fable 5 model after frontier researchers complained that the undisclosed limits felt like sabotage. The reversal follows the June 9 launch of the first public Mythos-class model and applies to requests tied to building competing frontier systems.

Original post
rohit@krishnanrohit

Glad Anthropic's walked the covert deception back

1:53 AM · Jun 11, 2026 · 289 Views
Developer Impact

Visible fallbacks now replace secret throttling

Flagged queries will route to the older Opus 4.8 model with an explicit reason returned in the response, matching the handling already used for cyber and bio risks. Server-side rollout starts in the coming days.
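For client code, a visible fallback means the substitution can be detected and logged. The sketch below is only an illustration under stated assumptions: the response shape, the `model` and `fallback_reason` field names, and the model identifiers are all hypothetical, since the post does not describe the actual response format.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class FallbackInfo:
    """Describes a visible fallback from the requested model to another one."""
    requested_model: str
    served_model: str
    reason: Optional[str]


def detect_fallback(
    requested_model: str, response: Mapping[str, Any]
) -> Optional[FallbackInfo]:
    """Return fallback details if the response was served by a different model.

    Assumes a hypothetical response mapping with a "model" key naming the
    model that actually answered and an optional "fallback_reason" key.
    """
    served = response.get("model", requested_model)
    reason = response.get("fallback_reason")
    if served == requested_model and reason is None:
        return None  # No fallback: the requested model answered.
    return FallbackInfo(requested_model, served, reason)


if __name__ == "__main__":
    # Hypothetical model identifiers and reason string, for illustration only.
    example = {"model": "claude-opus-4.8", "fallback_reason": "frontier_ai_research"}
    info = detect_fallback("claude-fable-5", example)
    if info is not None:
        print(f"Served by {info.served_model} instead of {info.requested_model}: {info.reason}")
```

Checking the served model against the requested one, rather than relying on the reason field alone, would also surface a substitution that arrives without an explanation.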

Open Question

The right balance on enforcement remains unsettled

Anthropic apologized for the original tradeoff and said it wants safeguards to be transparent rather than covert, yet it is still unclear how broadly the new visible checks will apply or how quickly they will catch up to evolving research techniques.

Sentiment

Supportive commenters praise Anthropic's quick reversal of its policy limiting Claude for AI researchers as a case of admitting a mistake and correcting course, while critics see it as insufficient PR that leaves the guardrails intact.

Pos: 64.3%
Neg: 35.7%
18 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 6.3K · Bookmarks 8 · Likes 136 · Retweets 8 · Replies 21
Chubby♨️@kimmonismus

That was quick: Anthropic reversed a controversial policy that would have secretly degraded Claude Fable 5 for users doing frontier AI research after backlash from researchers who saw it as covert sabotage of competing AI development.

1h · Views 6.3K · Likes 136 · Bookmarks 8
Chubby♨️@kimmonismus

https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/

1h · Views 1.7K · Likes 4 · Bookmarks 0
Peacerful@Peacerful

@kimmonismus another GPT 5.3 moment that's all. when 5.6 is released, they will remove the safeguards and that 22 june date, i think

1h · Views 62
ROXy@VK_ROXy

@kimmonismus Not a full rollback - the safeguards stay, just visible now instead of hidden.

1h · Views 30
Chubby♨️@kimmonismus

@MacInTheLoop i fully agree with you, don't get me wrong

19m · Views 23
dmitry@d_ilash

@kimmonismus It’s just step one. Step two will just get rid of weird keyword triggers

1h · Views 23
Barrak@BarrakAli

@kimmonismus Quietly degrading was the misstep, but listening and reversing fast is the kind of correction we should want to see more often.

1h · Views 22
Hahn@BayernHahn

@kimmonismus At least they admit their mistakes

1h · Views 22
Chubby♨️@kimmonismus

@BarrakAli yes, good one

20m · Views 20
ASM@ASM65617010

@kimmonismus Even Fable 5 sees it crystal clear

56m · Views 17
Arthur@arthurcolle

@kimmonismus safeguards and critical policies that can be changed in a day reveal the deeply fractured landscape of ai policymaking. Now watch me hit this drive

56m · Views 10

@kimmonismus wow, that was fast. guess they couldn't ignore the backlash. transparency is still a big deal

14m · Views 2 · Likes 1
Brian Mosley@Brian_Mosley_UK

@kimmonismus That's not a reversal, they've just been exposed. Damage limitation PR move only.

34m · Views 8
Mahaoo@mahaoo_ASI

@krishnanrohit "you don't want a model that lies accidentally or intentionally" "models sometimes can purposefully try to deceive you. we have to make sure that doesn't happen in production models"

these are direct quotes by daniela this is simply amazing

https://youtu.be/v1wZwxY3CMg?t=459

20m · Views 6
Bit-2@MacInTheLoop

@kimmonismus Make no mistake, they shouldn't have put those excessive guardrails in place to begin with.

54m · Views 6
saabena@Idat_Dissembler

Hellooo, the guardrails are staying - they’re just becoming visible. So let’s not break into applause yet; let’s actually read the whole thing. What’s worse is that Anthropic is quietly moderating the content of researchers’ work. They’re getting inconsistent results and have no way of knowing whether it’s due to poor input data or the model quietly trimming, altering, or adding things on its own. Fei-Fei Li was already getting quite elegantly pissed off about it on X on behalf of scientists. Honestly, I’m not even sure which plague is worse - Anthropic or OpenAI.

19m · Views 5
darkseidz@darkseidzz

@kimmonismus What? Did you even read it? It just makes it visible, who cares, the point is the degrading itself!

54m · Views 5
Brian Mosley@Brian_Mosley_UK

@BarrakAli @kimmonismus They will be dragged, kicking and screaming, into doing the right thing.

32m · Views 3