
Danielle Fong argues Anthropic's use of customer data for safety filters is inherently intertwined with frontier AI R&D

Ryan Greenblatt counters that Anthropic only retains triggered safety data

Original post
Danielle Fong 🔆 @DanielleFong · #756 in Tech

they specifically say they are doing this for the safety filters here

- I think "frontier AI R&D" shouldn't include iterating on how you use AIs / scaffolding (and I'd guess Anthropic doesn't want to block this, but might right now)

it fully does and the filters and maybe the wind down vector and maybe sabotage fires, and there's no clean boundary

Ryan Greenblatt@RyanPGreenblatt

@DanielleFong This seems a bit off:

- Anthropic says they don't train on customer data like this (and it seems credible to me).
- I think "frontier AI R&D" shouldn't include iterating on how you use AIs / scaffolding (and I'd guess Anthropic doesn't want to block this, but might right now)

1:54 PM · Jun 11, 2026 · 23 Views
Sentiment

Users criticized Anthropic safety classifiers as overzealous barriers blocking frontier AI research, harness work, and bio applications.

Pos: 0.0%
Neg: 100.0%
1 comment with sentiment.
Cluster Engagement
Posts from X
Most Activity

considering that the safety classifiers are currently the majority of what's in the way of ai work and any advanced harness work and anything bio or chem, i think people like microsoft banning fable over this classifier data retention is right.

the classifiers nerf capability so much that this plainly is capability research. and the classifiers are so overzealous it calls into question how much other things leak. trying to shut down conversations leaks into "it's late" for example across all models

Ryan Greenblatt@RyanPGreenblatt

@DanielleFong I think they claim to follow a policy like "we retain data when classifiers fire but aren't using this data for training more capable AIs or getting IP". I tend to think this is mostly credible (at least in aggregate).

1h · Views 35 · Likes 0 · Bookmarks 0