/Tech · 1h ago

Danielle Fong argues Anthropic's use of customer data for safety filters is inherently intertwined with frontier AI R&D

Ryan Greenblatt counters that Anthropic only retains triggered safety data

#756

Original post

Danielle Fong 🔆@DanielleFong · #756 in Tech

they specifically say they are doing this for the safety filters here

- I think "frontier AI R&D" shouldn't include iterating on how you use AIs / scaffolding (and I'd guess Anthropic doesn't want to block this, but might right now)

it fully does and the filters and maybe the wind down vector and maybe sabotage firses, and there's no clean boundary

Ryan Greenblatt@RyanPGreenblatt

@DanielleFong This seems a bit off:

- Anthropic says they don't train on customer data like this (and it seems credible to me).

- I think "frontier AI R&D" shouldn't include iterating on how you use AIs / scaffolding (and I'd guess Anthropic doesn't want to block this, but might right now)

1:54 PM · Jun 11, 2026 · 23 Views

Sentiment

Users criticized Anthropic safety classifiers as overzealous barriers blocking frontier AI research, harness work, and bio applications.

Pos: 0.0%

Neg: 100.0%

1 comment with sentiment.

Cluster Engagement

Danielle Fong 🔆@DanielleFong

considering that the safety classifiers are currently the majority of what's in the way of ai work and any advanced harness work and anything bio or chem, i think people like microsoft banning fable over this classifier data retention is right.

the classifiers so nerf capability that this plainly is capability research. and the classifiers are so overzealous it calls into question how much other things leak. trying to shut down conversations leaks into "it's late" for example across all models

Ryan Greenblatt@RyanPGreenblatt

@DanielleFong I think they claim to follow a policy like "we retain data when classifiers fire but aren't using this data for training more capable AIs or getting IP". I tend to think this is mostly credible (at least in aggregate).

1h

Replies: 1

Ryan Greenblatt@RyanPGreenblatt

Danielle Fong 🔆@DanielleFong

they specifically say they are doing this for the safety filters here

- I think "frontier AI R&D" shouldn't include iterating on how you use AIs / scaffolding (and I'd guess Anthropic doesn't want to block this, but might right now)

it fully does and the filters and maybe the wind down vector and maybe sabotage firses, and there's no clean boundary

1h