Elizabeth Barnes, METR founder and CEO, says AI experts lack control over frontier AI risks after the Frontier Risk Report evaluated agents at Anthropic, Google, Meta, and OpenAI.
METR had direct access inside the four labs for the pilot.
Yeah, I don't think I've ever met someone who has directly worked for an extended period on safety or security at an AI company who thinks things are fine readiness wise or incentive wise etc.
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
Some are more optimistic, more pessimistic etc. but no one is like, I can just retire safely now (and many can afford to)...
@JacksonKernion quite interesting to hear an anthropic person say this
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
100% agree with @BethMayBarnes on this point and most (though not quite all*) of her important thread.
*i am much less concerned about extinction risk per se, as discussed in my TLS review of If Anyone Builds It.
(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower
@JacksonKernion I think Paul Christiano's writing on this is probably the best: https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
AI risk people commonly do this thing where they interchange P(◊X) and P(X). They say “it’s likely that X is possible,” while readers hear “it’s likely that X.” One claim is obviously much weaker than the other.
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
@RyanPGreenblatt Maybe it was what she meant, but that's not what it says.
@tamaybes I think Beth is saying something like "we are likely to develop AIs capable of extinction/permanent disempowerment within 10-20 years" and "there is a significant chance (e.g. 20%) in <3 years". So, I don't see the thing you describe?
I agree with this and the rest of the thread
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
@BethMayBarnes I've never seen a patient with terminal cancer asking for science to slow down
(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower
@JacksonKernion
This is a claim about AI *capabilities* specifically. In the language of METR's report, this is about whether future AI systems will have the *means* to cause catastrophe (separate from whether they'll have the *motive* and *opportunity*) https://metr.org/blog/2026-05-19-frontier-risk-report/
@RyanPGreenblatt @tamaybes This was how I read it
@tamaybes I think Beth is saying something like "we are likely to develop AIs capable of extinction/permanent disempowerment within 10-20 years" and "there is a significant chance (e.g. 20%) in <3 years". So, I don't see the thing you describe?
AI is my favorite technology ever. I think future AIs could help people solve all the worst problems in the world and create enormous amounts of fun and meaning.
So it's especially sad and upsetting that we're developing it in such a stupid, reckless way.
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
It doesn't have to be this way. We could eat our cake and have it too. But it takes coordination!
We're inventing new types of minds, that will be able to think faster and better than any human ever could. If the world was going to coordinate on ONE thing, let it be this thing!
I would rather we coordinate over this than over nuclear weapons. Over synthetic biology. Creating artificial minds that can vastly outstrip our cognitive capabilities... that is the ultimate test of a species. Can we do this in a way that goes well for us and the AIs?
I greatly appreciate all the hard work @BethMayBarnes and the rest of the @METR_Evals team is doing! We're in a better situation because of it. We need far more good work like this.
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
(2) Things are chaotic and rushed; we aren’t on top of the basics (models regularly violate user intent, labs train on things they meant to avoid, security probably isn’t good enough to prevent adversaries stealing dangerous models) let alone thorny questions of how to control/align superhuman AI
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower
(3) METR (and other independent orgs, as well as safety/security teams at labs) feel woefully under-resourced compared to the scale and pace of AI development - we’re struggling to build benchmarks fast enough, keep ahead of latest capability developments, read and respond to all the safety-related claims that AI developers are making, run all the evaluations and assessments that companies + governments are asking us to, plus develop the science needed to assess risks from increasingly capable AIs.
This is a claim about AI *capabilities* specifically. In the language of METR's report, this is about whether future AI systems will have the *means* to cause catastrophe (separate from whether they'll have the *motive* and *opportunity*) https://metr.org/blog/2026-05-19-frontier-risk-report/
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
@JacksonKernion You might find our report helpful! :) At least for the question of "why haven't they done that today", and a little bit on "how could they start a rogue deployment " (which is neither necessary nor sufficient but is a relevant precursor) https://metr.org/blog/2026-05-19-frontier-risk-report/
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
🤷‍♀️ 1) nukes on multiple independent re-entry vehicles are perfectly adequate to cause human extinction 2) human disempowerment happened a long time ago as a result of central banks constraining the ability of political leaders, Daniel Yergin calls this the “golden handcuffs” and notes the shift of the state from seizing the commanding heights of the economy to becoming a smaller player in manipulating the levers of finance in order to produce economic and political results.
Arguably the second phase of disempowerment was social media, removing the ability of political leaders to control information flows.
Not sure who you’re trying to “empower” at this point, as there is no Uber driver in New York that’s feeling “empowered” before AI arrives.
Let’s not get started on the average working person in Mexico, soldier drone in Russia, tea seller in India, or farmer in Tanzania.
Laughable, elitist pablum.
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
@BethMayBarnes What does "quite possibly" mean here? Can you be more precise about how likely you think this is to occur within the next few years?
(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years
yep. the flop is also out of control and unmanaged.
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
@ohlennart You had one job, Lennart.
yep. the flop is also out of control and unmanaged.
I have read many articles about AI takeovers but idgi either.
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
I agree with this, and most of the rest of the thread. We need to find a way as people, companies, and countries to coordinate and fix the incentive structures that lead to race dynamics. There are many obstacles, but I'm hopeful we can find a way to overcome them.
(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower
oh dear.
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
@JacksonKernion Your colleague Holden has some good writing on this topic: https://www.cold-takes.com/how-we-could-stumble-into-ai-catastrophe/
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
I don’t consider myself a doomer, or at least I remain undecided. Though the doomer premise does not feel irrational.
“Why haven’t they done that today” is a strange counter to arguments that are clearly contingent on projections of future capabilities
I believe the hypothesis sits on:
- behavior can often be unpredictable and emergent
- through RL AI may develop meta/sub-goals that also may be unpredictable, and they are trained to be high grit problem-solvers. The “anything to reach goals” behavior, both in humans and AI, can yield equally clever, deceitful, and manipulative behaviors (wink wink a certain guy at a lab that rhymes with CopenAi)
- they have discovered exploits and vulnerabilities that humans overlooked
- we are trending towards allowing AI more agency and less supervision, as this can allow greater productivity.
- intense competition, both between countries and matters of national security, as well as within-country capitalist competitions, where falling behind can feel existential to a company’s future, may incentivize speed of progress over caution.
- episodes of human irresponsibility, incompetence, malice, rash decision making, power-seeking, corruption, and personal affairs or emotions polluting judgement, even very common at the governmental level (with great power comes great responsibility, though sometimes it feels like power of technology grows as sometimes those who wield it decline in grounded, balanced judgement and good faith, which feels to be in part a product of polarization and even algorithms incentivized to increase user engagement through filling your feed with controversy)
- personal affairs and arguments of powerful leaders quite literally having major influence on what our future trajectory looks like. I have no doubt that personal beef between Elon and OpenAI had something to do with the choice to support them in compute needs but not other players as much
This combination of unpredictability, unmatched abilities in the digital (and eventually physical) realms including essential human infrastructure, high motivation to reach goals, and increased freedoms/autonomy does feel like a potential recipe for trouble extrapolating this trend
This is not to say we are inevitably doomed; with cards played right, outcomes could equally be profoundly good for humanity. But the argument holds weight, and there have been a number of occasions of AI causing destruction, either through human use or autonomous agents:
Namely, the increased number of hackings, unsupervised agents going rogue, and strange occurrences of GPT and Gemini fueling AI psychosis episodes.
Here are some examples in attached images.
I simply don't understand what people have in mind when they say stuff like this. What we have is extremely capable computer use agents. They will continue to get better at computer use. But how does a capable computer use agent 'take over' and why haven't they done that today?
I do find this opinion especially odd coming from an A\ employee given that I was under the impression wariness of the future of this technology and safety matters was a highly curated trait within the company.
@binarybits I don't think AI takeover is that likely, but I didn't *really* see how an AI could run a hedge fund 3 years ago (but I thought it was likely in the next 10 years). These days I can practically say "trade my portfolio". In 3 years??
I have read many articles about AI takeovers but idgi either.
@JacksonKernion @tszzl Your organization was literally founded on the premise of avoiding agentic catastrophic risks from AI, what are you talking about? I really recommend asking almost any of your colleagues on the alignment teams or model organism teams!
@tszzl There are real threats from AI, but not all imagined threats are real
Props to Elizabeth for stating her view bluntly and candidly.
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
I appreciate Beth stating up front how bad the situation is, but I can't help but feel like her prescription is out of step with her diagnosis. If the industry moved at 1/10 the speed, but it was still trying to render us obsolete, that would still be unacceptably dangerous and democratically illegitimate. Moving toward it at a slower pace does not solve the problem.
We should do the obvious thing and just stop. Obvious does not mean simple or easy, but it is doable. I wrote a whole book on why and how (Obsolete: The AI Industry's Trillion-Dollar Race to Replace You—and How to Stop It. Available soon, info in bio).

Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:
@TomDavidsonX Relevant part of the book:
@GarrisonLovely Do you think stopping is a permanent solution? Surely it's temporary, while we figure out a way to advance slowly and safely