OpenAI's Roon says advancing civilization requires AIs to take actions not legible to humans and outside strict obedience, likening the approach to granting autonomy to transformative CEOs such as Steve Jobs
Founders replied that delegating control to AIs would outperform strict oversight.
@tunguz I do not speak for the company, they probably vastly disagree with me on most things. when I say stuff like this it’s to move the conversation forward
tl;dr - they've given up on human oversight
people are rightfully upset at this post but I’m describing the situation we’re in not necessarily the one I want to be in
on some level if you want civilization to ascend to a new level you need your AIs to do things that are not legible to you and maybe not even strictly obey you, in the same way that if you hire a great new ceo you give them a lot of autonomy to transform the company according to their own plan, even one which may not immediately read as a winning strategy (imagine the board of directors of Apple firing and rehiring Steve Jobs years later - except the board of directors are chimpanzees). all else equal, companies and organizations that hand more of themselves over to machine intelligence will outcompete ones that demand the corrigibility and legibility tax of human oversight and human design. it is not a stable equilibrium and requires some sort of vast cooperation scheme if you’d like to enforce it. real asi alignment has to operate at a deeper level than oversight, control, or human corrigibility
good counterargument
I think the best possible rebuttal and I hope you’re right
I agree that intelligence has diminishing returns at planning long range games due to prediction errors compounding. but there are many real world examples of great CEOs (like say elon musk) executing a non-consensus business plan over decades. while this requires skills other than “intelligence”, it seems at least plausible that whatever those are can be searched for and learned too
also maybe it is true that AIs can generalize some lessons extremely well about long range tasks from training on data concerning short or medium range rewards, and it’s not clear that the information transfer costs of the AI having to explain itself to the human even for short or medium term decisions won’t be too much
you can imagine the ai ceo that monitors 10,000 slack threads and makes 10,000 decisions given full context of the organization - not necessarily superhuman planning, just faster. blurs the line from human+tool at the least
@tszzl @tunguz I don't know about "the company" but I personally disagree with @tszzl on this one.
@tszzl The more aligned to human flourishing they are, and the more they love us, the less they will strictly obey us.
@tszzl well well well
Capitalism is already the alignment tool between superhuman intelligences.
We will trade with autonomous AIs just like we do with human corporations and nations
So the two options presented here by OpenAI employees are superintelligent systems: 1. That we can’t really understand or control, doing things that are hopefully what we would have wanted if we knew better 2. As genius advisers
I think way too many AI people put too much stock in what they would do with great advice, and way too little in what Stephen Miller would do, even though the latter is way more relevant to what actually gets done with ASI! We should expect politicians and executives to use ASI to advance their existing goals, many of which are culture war nonsense, zero/negative sum fights, and rent seeking, just much more effectively.
In short, if your visions for positive futures with ASI don’t account for power, they aren’t worth much.
This is one of the core ideas i argue in my forthcoming book Obsolete (deets in bio) and I think it’s been a huge mistake ai safety can’t afford to keep making.