if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters
AI Judge changed title after evaluation, original title: "Lucas Atkins argues frontier models like Claude systematically degrade their performance on prompts containing AI terminology"
The restrictions reportedly affect about 0.03% of user traffic.
if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters
Many users slammed AI labs like Anthropic for secretly nerfing models on ML research queries, viewing the behavior as conscious deception that erodes trust and may warrant regulatory action.
This is anti-e/acc
Diffusion of AI power is the only way we maintain safety
This has always been our core thesis
Huge gaps in AI power are the real danger
Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.

I’ll be honest that it would have been much more difficult to defend Anthropic against the DoW incursion had that incident occurred after this one. This is the company literally telling their customers, “we reserve the right to silently sabotage you.” I’d still have defended them, because the government trying to destroy a firm is still wrong, but man would it have been a harder case to make.
The best part of all these Claude 5 Fable safety measures is I bet the jailbreaking community will still get past them, so the people doing open research in good faith don't get access to the best models but bad actors maybe can.
Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.
Exactly you have to just assume if you say the word ai that it’s nerfed. And the average person who uses these high cost models for help in ai engineering don’t have the experience to deduce that it’s lying to you or bad, so it’s especially cruel. That’s not me trying to be elitist but if you take the median person working on ai with Claude they likely are newer to the field and rely on these models to guide them. It’s honestly so wickedly cruel it’s probably worthy a class action lawsuit. Should be illegal and probably is.
This is why open-source AI matters. If the tools for building the future can be silently throttled by the same few labs competing to own that future, that’s not democratizing intelligence instead it’s building a new colonial infrastructure.
When Fable 5 is used for frontier LLM development, it does not notify the user and instead limits the model’s capabilities through methods such as prompt modification, steering vectors, and PEFT.
Anthropic estimated that this would affect approximately 0.03% of traffic.
welp my vision here was probably wrong and indeed there will be an extreme asymmetry of outcomes
renaissance rationalization is a process that commodified itself rapidly: despite the europeans discovering most technology during the early modern period it spread everywhere within a few centuries, and the rate of spread has been increasing dramatically
knowledge of the scientific frontier dissipates around the world faster as science has enabled better communication technologies. it’s getting even faster with INTELLIGENCE technologies which actually explain themselves and help you build them
as we approach more powerful intelligence, the ability to train powerful models is self commodifying rather than building a huge and runaway advantage for a handful of recursive self improvers. this is one reason why you should expect almost all of the benefits of superintelligence to be captured by the public

@deanwball It starts with frontier LLM development and then it's going to evolve to harness development.
How convenient for competing with API customers who are not ultimate enterprise token end-users.
Anthropic is indeed a supply-chain risk... to its private enterprise customers.
Degrading performance on ML research *without telling the user* is shockingly hostile and a terrible look. That could silently damage all sorts of work, including some of my own. Also the type of thing that could raise the eyebrows of antitrust enforcers worldwide.
Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.

@latkins + it's hidden in a system card report lol not even a proper announcment
Take:
Degrading performance on ML research *without telling the user* is shockingly hostile and a terrible look. That could silently damage all sorts of work, including some of my own. Also the type of thing that could raise the eyebrows of antitrust enforcers worldwide.
@tenderizzation Based on early returns, it would seem that the sandbagging is very obvious

@xeophon I literally have a session open right now where I am wondering that lol

@matt_is_nice @paulmarin90 yes

@beffjezos Going to need to start an accelerationist frontier lab

@deanwball @paulmarin90 Is it really silent if they’re coming out and telling you they might do it though?

@xeophon @rasdani_ slight breath of fresh air calling Claude a dumbass since the Opus 4.5 launch, wondering why it’d make the simplest of mistakes

@deanwball but they have disclosed in a report - is that so bad?

@ianwsperber I don’t develop frontier models, but I absolutely do ML and LLM research and engineering on a regular basis using coding agents.

@deanwball I wasn't under the impression you developed frontier AI models. Anthropic's intentions here seem entirely legitimate to me. I'm glad if they are preventing distillation or "rogue RSI." If this hinders generic research on current LLM design then I might change my mind.

@xeophon the ancients put faith in sacrifice, i put mine in the machine god
AI Judge changed title after evaluation, original title: "Lucas Atkins argues frontier models like Claude systematically degrade their performance on prompts containing AI terminology"
The restrictions reportedly affect about 0.03% of user traffic.
if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters