Richard Ngo describes a game-theoretic distinction between moral patients, governed by utility functions, and moral agents, governed by decision theory, and proposes "identity" as a predictive self-model for cooperation.
David Manheim endorsed the framing and awaits a detailed writeup.
In game theory, your interactions with moral patients are governed by your utility function, whereas your interactions with moral agents (whose decisions are correlated with yours) are governed by your decision theory. IMO, given this distinction, ethics is more about having a good decision theory than a good utility function. But it seems more productive to study the thing that’s halfway between them, which I call an “identity”: a predictive self-model that lets you produce self-fulfilling predictions of cooperation with other agents. This isn’t meant to make much sense based on just this tweet. But I have a talk + research agenda on this coming out soon.
@EpistemicHope this distills some points we’ve touched on previously.
Or more briefly: moral agents are the other players in the game, whereas moral patients are part of the environment. To understand ethics we’ll therefore need to naturalize game theory to remove this dualistic distinction, as MIRI’s old agent foundations research was doing.
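
A toy sketch of the "decisions are correlated with yours" clause, for readers who want something concrete. This is not the formalism from the thread: the payoff numbers, the function names, and the single `correlation` parameter are all assumptions made up for illustration. It only shows why the same payoffs can recommend different moves depending on whether the other party is modeled as part of the environment or as a player whose choice tracks yours.

```python
# Toy one-shot prisoner's dilemma. Payoff values are illustrative assumptions.
PAYOFF = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def best_move_vs_environment(p_other_cooperates):
    """The other party is part of the environment: it cooperates with a
    fixed probability that does not depend on what I choose."""
    def expected(my_move):
        return (p_other_cooperates * PAYOFF[(my_move, "C")]
                + (1 - p_other_cooperates) * PAYOFF[(my_move, "D")])
    return max(MOVES, key=expected)


def best_move_vs_correlated_agent(correlation):
    """The other party is another player: its move matches mine with
    probability `correlation` (a hypothetical stand-in for correlated
    decisions), and is the opposite move otherwise."""
    def expected(my_move):
        opposite = "D" if my_move == "C" else "C"
        return (correlation * PAYOFF[(my_move, my_move)]
                + (1 - correlation) * PAYOFF[(my_move, opposite)])
    return max(MOVES, key=expected)


if __name__ == "__main__":
    print(best_move_vs_environment(0.5))       # fixed environment
    print(best_move_vs_correlated_agent(1.0))  # perfectly correlated player
```

With these assumed payoffs, defection is the better reply to any fixed environment, while cooperation becomes the better choice once the correlation is high enough (above 5/7 here). The threshold is an artifact of the made-up numbers, not a claim about the research agenda.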