
Aaron Scher@aaronscher
The best piece about the current state of alignment https://www.lesswrong.com/posts/WewsByywWNhX9rtwi/current-ais-seem-pretty-misaligned-to-me
Users praise the article challenging market incentives for AI alignment as the best piece on the current state of the field.

We don't have the ideal comparison data here: ideally there would be a Sonnet 4.5 level model that was very honest, didn't overclaim, never reward hacked, etc. But we have some evidence: current models aren't very well aligned, and people still use them & hand off important tasks to them.

@aaronscher Another piece of evidence is that the market didn't converge on this: https://gwern.net/guardian-angel