
Yes, progress on some aspects of AI risk gives us strong reasons to be more optimistic than we once were, but the continued lack of progress on other aspects should worry us.


We expected that as we approached human-level capabilities, progress would accelerate. That is happening.
We also expected that stronger AI systems would help us with alignment. That *isn’t* happening.
Yet people don’t seem to be updating toward greater worry; I think they should.

Sonnet is not a frontier model, so if alignment progress were happening, we should expect developers to be able to make it safer than earlier models. It doesn’t pose more risk than previous models, but Anthropic hasn’t made it materially safer than previous attempts either.