AI, 4h ago

LessWrong Post Argues Current AIs Seem Pretty Misaligned

Posts from X
Aaron Scher (@aaronscher)

The best piece about the current state of alignment https://www.lesswrong.com/posts/WewsByywWNhX9rtwi/current-ais-seem-pretty-misaligned-to-me

4h, Views 49, Likes 1, Bookmarks 1
Aaron Scher (@aaronscher)

We don't have the ideal comparison data here: ideally there would be a Sonnet 4.5-level model that was very honest, didn't overclaim, never reward hacked, etc. But we have some evidence: current models aren't very well aligned, and people still use them and hand off important tasks.

4h, Views 23, Likes 1
quetzal_rainbow (@quetzal_rainbow)

@aaronscher Another piece of evidence is that the market didn't converge on this: https://gwern.net/guardian-angel

3h, Views 13