Tech · 16h ago

Interpretability researcher Nick Cammarata warns Anthropic may be silently restricting model capabilities, disrupting safety research

He lacks reliable benchmarks to confirm these capability changes.

Original post
Nick @nickcammarata · #407 in Tech

i think it's bad for anthropic to nerf ml silently. I don't know if interpretability counts as frontier ai model research or not. everything i'm doing is differentially for safety, idk if i'm being nerfed, and don't have great benchmarks to tell

8:16 PM · Jun 9, 2026 · 24.9K Views
Sentiment

Some users express optimism that the researcher will regain unnerfed model access for his ML safety work, since his connections may help him get around the restrictions.

Pos: 100.0%
Neg: 0.0%
2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
VIEWS 3.5K
Nick @nickcammarata

unclear if the ml nerfs are permanent, would be nice for anthropic to say something on this explicitly if it is the case

16h · Views 3.5K · Likes 24 · Bookmarks 3
BOOKMARKS 4 · LIKES 119
Nick @nickcammarata

i think the best would be for anthropic to work with more orgs to have unnerfed models. i'm a pretty big anthropic stan but the world where you have to join anthropic specifically to do your best safety work i think is not the ideal world

16h · Views 3.4K · Likes 119 · Bookmarks 4
RETWEETS 2
Nick @nickcammarata

oh now won't let me use fable at all. i was using it for interp, mostly working (unclear if nerfed). in a separate chat i asked a random question about papayas i was curious about, it got flagged as biosecurity concern?? now won't let me talk about interp in other chats either

15h · Views 2.3K · Likes 47 · Bookmarks 3
REPLIES 4
Nick @nickcammarata

using agents for interp is confusing enough without knowing if your agent is being probed to make it silently dumb on purpose

16h · Views 3.4K · Likes 82 · Bookmarks 3
Nick @nickcammarata

oh if they said this i missed it, if it's only a short term thing i think that's fine, they're going through a lot and i think it's reasonable to cut them a lot of slack right now

16h · Views 2.6K · Likes 30 · Bookmarks 3
bayes @bayeslord

@nickcammarata I’m at like >50% chance you will get unnerfed access

16h · Views 127 · Likes 3 · Bookmarks 0
Nick @nickcammarata

@bayeslord bc they specifically judged my account as good or bc the work i'm doing will be judged as not frontier training

16h · Views 134 · Likes 3 · Bookmarks 0
bayes @bayeslord

@nickcammarata Both are included as well as eg affiliative factors

16h · Views 54 · Likes 2
Aaron Bergman 🔍 @AaronBergman18

@nickcammarata I feel like you specifically might be well connected + good enough to get around this by talking to some people

Idk/I have no special info and don’t really know what ur working on more specifically

16h · Views 84 · Likes 5
Liv @livgorton

@nickcammarata

all safeguards will be improved

15h · Views 9 · Likes 0 · Bookmarks 0
Nick @nickcammarata

@bayeslord you're right, anthropic is a well known member of the jhana community of which i am also affiliated

16h · Views 41 · Likes 1
huli @honorablepicnic

@nickcammarata Why do you think they are doing it silently? It's interesting since they're doing it loudly in so many categories, including distillation

16h · Views 12