AI Safety Pioneer Recalls Early Fears Of Uncontrollable Self-Play Agents

Original post

Dylan Hadfield-Menell @dhadfieldmenell

When I started working on AI Safety, the concern was that heavy self-play/multi-agent competition would effectively summon Homo Economicus and that it would be very hard to control/align.

8:36 AM · May 24, 2026

Dylan Hadfield-Menell @dhadfieldmenell

Pretrained LLMs were a step away from that path. Lots of reasons why they were a much better substrate for alignment.

What’s been most disheartening to me about the last 18 months is that we’ve decided to go pedal to the metal back in that original direction.

3:36 PM · May 24, 2026 · 258 Views

QUOTE POST

Dylan Hadfield-Menell @dhadfieldmenell

Reminds me of this undefeated 2023 tweet from @lxrjl:

3:37 PM · May 24, 2026 · 289 Views
