Richard Ngo and Toby Ord examine how AI safety efforts produce unintended outcomes including accelerated progress and institutional advantages from safety rhetoric

Ord is a senior researcher at Oxford University's AI Governance Initiative.

Original post

@RichardMCNgo I agree that work on understanding and overcoming dangers can backfire and it is useful to know when it does (and also when it genuinely helped and what the balance between those is).

12:53 AM · May 20, 2026

Part of the problem is conflating actual attempts to understand things with the use of “understanding” or “safety” as fig leaves for power-seeking.

Rationalists do the former much more than EAs, which is why they have many fewer prestigious jobs, lab equity, etc (and get called weird much more). Most of the best thinkers I know (e.g. people I follow on @researchngo) are institutionally independent by now.

In general my advice is to cultivate virtues, many of which are culturally transmitted defense mechanisms against counterproductive behavior. Virtue is hard to pin down but it’s often fairly obvious if you’re looking for it. As one example, the old Racing to the Precipice paper had much more intellectual virtue than Situational Awareness—the latter was skewed in many ways by social incentives, as you can see by reading the text.

Meanwhile most “safety” work inside labs is so shaped by corporate and/or ideological incentives that it’s hard to extract meaningful signal from it. These papers are not good science, but they feel so useful (slash people are so panicked about short timelines) that the field embraces them anyway.

More generally, these days it’s much easier to convey important ideas with nuance and clarity in blog posts than in papers. Insofar as people still write papers for the prestige and recognition, I’d say that’s also less intellectually virtuous. (There’s still a role for papers in pinning down a legible idea for adversarial scrutiny, but I’d recommend something like a 10:1 ratio of posts to papers on a given topic.)

Toby Ord (@tobyordoxford)

@RichardMCNgo The former seems hard to support. It could be true, though is a very pessimistic world where all knowledge on the most important questions is bad and we can only fly blind. If you mean the latter, then advice on how you think they could do it better would be useful.

7:57 AM · May 20, 2026 · 138 Views
9:50 AM · May 20, 2026 · 5 Views

@RichardMCNgo But I don't quite understand what you are trying to say with your threads on this topic. Are you saying people should 100% stop investigating and understanding the key dangers of our time? Or saying that they need to be more cautious in how they do so?

Toby Ord (@tobyordoxford)

@RichardMCNgo I agree that work on understanding and overcoming dangers can backfire and it is useful to know when it does (and also when it genuinely helped and what the balance between those is).

7:53 AM · May 20, 2026 · 57 Views
7:54 AM · May 20, 2026 · 80 Views
