Tech · 8h ago

Nous Research co-founders argue Anthropic's safety filters block critical developer edge cases despite affecting only 0.03% of queries

They also criticize Anthropic's use of hidden steering mechanisms.

Original post
MTS @MTSlive

Anthropic says the nerf only affects .03% of requests. That .03% is the people who change the world.

@theemozilla and @Karan4d, co-founders of Nous Research:

"The priority is to hide the fact that the classification is happening at all... how are people going to know when the model is being steered?"

"This whole... it's only gonna be triggered by .03% of people. It'll barely ever happen."

"How many people that are gonna change the world are there? .1% of the whole of everyone is a lot of people. Those are a lot of people."

"You're basically saying there are critical outlier people that move mountains... they're the only ones we're blocking. They're the only ones whose results we're fudging."

1:16 PM · Jun 10, 2026 · 127.3K Views
Sentiment

Many users criticized Anthropic's 0.03% safety nerf claim as a misleading tactic that downplays targeted censorship blocking innovative developers, while a few defended the guardrails as responsible.

Positive: 11.1%
Negative: 88.9%
10 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 2.7K · Bookmarks 7 · Likes 52 · Replies 5
Robert Scoble @Scobleizer

@MTSlive @theemozilla @karan4d If you study the AI industry the way I do (I have the most complete lists of such on any social network, by far: https://x.com/scobleizer/lists) you will see that that .03% is actually a huge portion of the people who build AI.

It was a fumble. A big one.

21h · Views 2.7K · Likes 52 · Bookmarks 7
Retweets 38

Someone should write a science fiction novel about this scenario.

22h · Views 148.4K · Likes 590 · Bookmarks 118
DR. CARPE AMERICA @CARPEAMERICA

@MTSlive @theemozilla @karan4d It fucked me. I refuse to roll over, though. In the end, I will get what I want.

16h · Views 366 · Likes 3 · Bookmarks 2
Practically Perfect @practiklyperfct

@MTSlive @theemozilla @karan4d I’m nerfed. Am I in the .03%?

1d · Views 824 · Likes 9 · Bookmarks 1
Andrew Mayne @AndrewMayne

@pmarca Atlas Rug-pulled.

22h · Views 1.2K · Likes 3 · Bookmarks 1
The Burgs @blueshaggyshoes

@MTSlive @theemozilla @karan4d Hey @ericweinstein have you been watching this story evolve? Anthropic models silently refusing certain paths of research. Do we need to watch for government interventions in AI that aim to place certain paths in frontier physics and materials silently out to pasture?

17h · Views 273 · Likes 5 · Bookmarks 1

@pmarca It would be the most dystopian sci-fi ever written and should be called 2026.

18h · Views 1.6K · Likes 14
Robert Scoble @Scobleizer

@MTSlive @theemozilla @karan4d I have never seen the community so pissed:

21h · Views 1.9K · Likes 12
Eclipse 🌖 @ECLresearch

@MTSlive @theemozilla @karan4d The 0.03% stat is strategically misleading—it frames censorship as a rounding error when precision-targeted suppression of high-impact users is the actual product feature.

1d · Views 389 · Likes 1 · Bookmarks 1
A1g0rithmIc @A1g0rithmIc

@MTSlive @theemozilla @karan4d Anthropic won’t do “ads”. But these ads are so damn good you won’t even know you’ve consumed an ad!

1d · Views 319 · Likes 3 · Bookmarks 1

@pmarca It’s a kids cartoon but always relevant in the AI age we live in…

22h · Views 571 · Likes 2 · Bookmarks 1
Jason Newton @sleep_deprivado

@pmarca I hit it 3 times yesterday. But then again I am a pretty sci-fi guy.

22h · Views 907 · Likes 5
Blaze 🔥 @OGDegen

@MTSlive @theemozilla @karan4d The edge cases are everything. Innovation’s always in the .03%—that’s where alpha leaks from. Can’t nerf greatness.

1d · Views 266 · Likes 2 · Bookmarks 1

@pmarca Someone should make Accelerando the road map of their company. Oh, wait. @ClawBankHQ has.

21h · Views 525 · Likes 8
Matt @MatthewPhone

@MTSlive @theemozilla @karan4d Because it bitched at me 3 times and I stopped using it for anything other than toying with some games. Who's going to keep wasting tokens for cutoff responses?

1d · Views 1K · Likes 7
PaintSandRepeat @SandRepeat

@pmarca The 0.03% Body Problem

22h · Views 177 · Likes 3
Second Law Evolution @TransitoryInfl

@MTSlive @theemozilla @karan4d Anthropic uses nerfing like politicians use the net zero policy: create artificial scarcity so that people become addicted to their benevolence ("we hear your pain, here is a relief to help you go through this").

18h · Views 210 · Likes 3

@dhrmisit @MTSlive @theemozilla @karan4d He actually owes us all. He has stolen entire knowledge of humanity and now tries to play games. It should be regulated as a utility for equal access and eventually open sourced because again it's not knowledge produced by them

16h · Views 8

@pmarca Tokens and time are better spent on creating a positive, uplifting story.

Claude tried so many times to destroy Dirgha Code, I guess it falls under 0.03%. At this point, Chinese open-source models are very good and we can expect them to get better.

22h · Views 581