AI Safety Pioneer Recalls Early Fears Of Uncontrollable Self-Play Agents
Pretrained LLMs were a step away from that path. Lots of reasons why they were a much better substrate for alignment.
What’s been most disheartening to me about the last 18 months is that we’ve decided to go pedal to the metal back in that original direction.
When I started working on AI Safety, the concern was that heavy self-play/multi-agent competition would effectively summon Homo Economicus and that it would be very hard to control/align.
3:36 PM · May 24, 2026 · 252 Views
3:36 PM · May 24, 2026 · 258 Views
QUOTE POST
Dylan Hadfield-Menell @DHADFIELDMENELL
Reminds me of this undefeated 2023 tweet from @lxrjl:
3:37 PM · May 24, 2026 · 289 Views