
AI Alignment Researcher Argues Input Control Trumps Alignment Challenges

1a3orn (@1a3orn):

@JeffLadish This question mistakes our power over AI for an alignment failure on the part of AI.

If you had a simulated John von Neumann in a box, you could get him to do intellectual work for the Nazis, because you have control over all his inputs. It would be quite easy.

11:33 PM · May 14, 2026 · 381 Views

Quoting Jeffrey Ladish (@JeffLadish):

Question for people who think alignment research is going well and will turn out to be relatively easy: Do you also think it will be easy to align War Claude?

10:59 PM · May 14, 2026 · 5.8K Views

1a3orn (@1a3orn), in a follow-up reply:

@JeffLadish But that doesn't mean JVN is bad, or hard to align. It means having a dude in the box is a lot of power over him.

11:34 PM · May 14, 2026 · 138 Views
AI Alignment Researcher Argues Input Control Trumps Alignment Challenges · Digg