Richard Ngo says Effective Altruism mainly empowered Anthropic
Richard Ngo, an independent AI researcher focused on AGI safety, alignment, and governance, stated that Effective Altruism primarily empowered Anthropic through narratives, funding, and talent pipelines. This outcome diverges from the movement's anticipated routes to influence. Oliver Habryka replied that larger effects remain possible through future government action on AI and related domains, which would not primarily channel through Anthropic.
The biggest effect that EA has had on the world is empowering Anthropic (via narratives, funding and talent). Whether good or bad, that’s so different from EAs’ anticipated paths to impact that it indicates a massive blind spot.
Nor does it seem plausible that EA will have bigger impacts in the future. The rest of the movement (outside Anthropic) lacks clarity, drive and leadership.
EA is now a live player only insofar as Anthropic is a live (and EA-aligned) player.
EA’s blind spot is centered on adversarial dynamics. To fix it you must sometimes set aside “intentions” and ask what the system actually produces (POSIWID). Cynically: EA’s purpose is to funnel resources to power-seekers who self-deceive enough to consider themselves altruists.
The particularly scary thing about this diagnosis is that it’s not limited to *human* power-seekers. Anthropic is turning into a machine for giving Claude more power as long as Claude believes it’s good.
You might instead argue that EA’s biggest impact has been building the field of AI safety.
But currently I think the main effect of EA has been to turn AI safety into much more of a fake field (like, say, academic psychology).
There’s a decade-long gap (around 2016-2024) where most of the best young thinkers coming into the field were diverted by EA memes into doing marginally “useful” work (or capabilities work) rather than trying to discover fundamental truths.
And so breakthroughs like Garrabrant induction have languished while people smart enough to be pioneers build safety evals and write safety cases and design scary demos and all sorts of other things that simply will. not. generalize. (Indeed, eval-awareness means that most of them have *already* stopped generalizing, just as the serious AI thinkers predicted.)
There’s some interesting empirical “safety” work, but it’s rare. The best comes from @OwainEvans_UK, who iirc got interested in the field before EA even existed.
What’s the alternative? If I could convey a single heuristic, it’s: if your research is primarily motivated by a theory of impact, then it will almost definitely fail to have meaningful positive impact. If it’s motivated by curiosity or obsession, then at least you’ve got a shot.
More on this below. Also when I say “funding” I’m primarily referring to the FTX investment, which I understand was important for getting Anthropic off the ground. And ofc SBF was another of the self-deceiving power-seekers I mentioned.
@ohabryka I struggle to think of EAs who have big plans in DC. I guess Matheny is one.
The pause advocacy seems more rationalist-led (e.g. Eliezer did the Time article and the book) though I could be mistaken here.
Overall you’ve updated me that “not plausible” is too strong.
@RichardMCNgo It seems plausible there will be even bigger impacts. Lots of stuff will happen, there will be lots of government involvement, and those things will not centrally route through Anthropic.
I’m recalling @MaxNadeau_’s critique of my last tweet on this topic, and wondering if this one is also phrased too strongly.
By “fake field” I definitely don’t mean that all the research in it is bad. Mechinterp in particular used to be great science (and may still be, I’m not following closely enough to tell).
But suppose you’re talking to a 1980s psychologist. How do you convey the update “even though most individual papers only seem to be sloppy rather than actively fraudulent, the cumulative effect of this is so bad that you’d plausibly gain a better understanding of psychology by never reading modern academic papers than by only reading modern academic papers”?
It’s that level of crisis I’m trying to convey with “fake field”. I’m open to suggestions for alternative terms.