/Tech, 10h ago

LessWrong Post Argues Current AIs Seem Pretty Misaligned

Original post unavailable.

Sentiment

Users praise the article, which challenges the idea that market incentives will deliver AI alignment, as the best piece on the current state of the field.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

VIEWS 49, BOOKMARKS 1, RETWEETS 1

Aaron Scher (@aaronscher)

The best piece about the current state of alignment https://www.lesswrong.com/posts/WewsByywWNhX9rtwi/current-ais-seem-pretty-misaligned-to-me

10h ago

LIKES 1, REPLIES 1

Aaron Scher (@aaronscher)

We don't have the ideal comparison data here: ideally there would be a Sonnet 4.5 level model that was very honest, didn't overclaim, never reward hacked, etc. But we have some evidence: current models aren't very well aligned, and people still use them & hand off important tasks.

10h ago

quetzal_rainbow (@quetzal_rainbow)

@aaronscher Another piece of evidence is that the market didn't converge on this: https://gwern.net/guardian-angel

9h ago