/Tech · 2h ago

Slow AI Progress Fuels Underappreciated Safety Pessimism




Posts from X


David Manheim@davidmanheim

Yes, there are strong reasons to be more optimistic than we once were about risks from AI based on progress on some aspects of AI risk, but the continued lack of progress for other aspects should worry us.


David Manheim@davidmanheim

We expected that as we approached human-level capabilities, progress would accelerate. That is happening.

We also expected that stronger AI systems would help us with alignment. That *isn’t* happening.

And people don’t seem to update to being more worried; I think they should.


David Manheim@davidmanheim

Sonnet is not frontier, so if alignment progress were happening, we should expect that developers could make it safer than earlier models. And it doesn’t pose more risk than previous models, but Anthropic hasn’t made it materially safer than previous attempts either.


David Manheim@davidmanheim

Sonnet 5 was just released. It’s about as aligned as Opus 4.8, and still has most of the same failure modes, at similar rates, despite being a weaker model, despite the progress enabled by Mythos, and despite (presumably) continued work on alignment.
