OpenAI's Adrien Ecoffet argues the AI safety community prioritized legislative battles over building actionable technical standards

Yo Shavit defended FLOP thresholds to protect academic models.


Original post

Adrien Ecoffet@AdrienLE

I agree that the safety community could have done a better job getting ready for the current situation instead of focusing on the SB1047 culture war, but they will rightly say that at least they were trying something, so why should blame fall primarily on them when others were indifferent or actively working against them?

I'd actually love a blameless retro of how we ended up with a government that heavily regulates AI in an ad-hoc and chaotic way while the constituency that historically most wants AI regulated has had so little impact on how this regulation is happening that it outright opposes it.

Fascinating stuff with a lot to learn from, but I don't think the community has reached the stage of grief needed to look at these questions objectively yet.

Joshua Achiam@jachiam0

I strongly disagree and consider this in the territory of highly-motivated historical revisionism. The safety activists have had years to get their acts together on developing clear and comprehensive technical standards that could govern advanced AI release decisions. Useful ideas about the importance of specifications and standards swirled within the community for years. Did the Andreessens of the world help with this? Of course not. But the AI safety world was flush with unbelievable amounts of money and the best technical talent in the world. The failure to produce off-the-shelf decision-making guidelines at a level robust enough for government to adopt and have confidence in is wholly a failure of the AI safety community. The community also very much confused the state of affairs with poorly-conceived political fights like SB 1047, which tried to advance a standard based on training FLOPs instead of anything more meaningful. If AI safety activists can't take ownership for the current state of affairs pertaining to AI safety standards, then the whole field is bankrupt of humility and ideas.

10:43 AM · Jun 26, 2026 · 566 Views

Posts from X

Yo Shavit@yonashav

Josh, I think you might be operating under some bad information.

This seems to be a misunderstanding of why FLOPs have been included in every attempt at safety legislation. There’s an inherent need to ask “to what types of models do you apply the standards”, lest we run cyber evals on every academic lab’s 50M param pretrain.

Also, it’s not that people didn’t propose standards. See eg Transluce’s draft work. But to get broad acceptance of specific standards, you would need the labs to be willing to agree, which they strongly preferred to avoid doing to not tie their future hands wrt regulatory constraints under conditions they couldn’t foresee. That’s why every lab safety framework is vague on exact threshold operationalization.

Joshua Achiam@jachiam0

I am not operating from bad information nor from misunderstanding. I was directly on the inside of many of these tradeoffs. Some counterpoints:

FLOP thresholds have always been bad policy, period. One did not actually have to write policy that required a battery of tests on every model, and saying that FLOP thresholds were the needed remedy is wrong. FLOP thresholds do not correlate directly with capabilities, let alone dangerous capabilities. FLOPs required to reach the frontier gradually decrease over time as a result of algorithmic advances and this is well-known, but even writing policies that let you move FLOP thresholds down over time would not save you from the bigger issue: capabilities are not just a function of model training compute, they are a function of test time compute applied to the problem of interest, the specific capabilities you train for, and also other elements of the technical system they're connected to. These problems have been known since before 1047. They were ignored. This has had the effect of profoundly confusing the policy conversation about where and how to set thresholds. Different standards with fuzzier decision boundaries about the actual right targets could have been used. Why not "A model that might reasonably be expected to contribute to cyber attacks should be evaluated according to (etc. etc.)" instead of a model that clears a FLOP count? This would not have required the 50M pretrain to go through heavyweight testing. These types of issues were always obvious and these types of solutions were always plausible, but the AI safety community insisted that 1047-shaped policies were "good steps" anyway; defense of bad policy became a tribal shibboleth.

I also didn't say that people didn't propose standards. Of course people proposed standards. But nothing has become the clear unified policy that people agreed on. And importantly the AI safety community did not rally around a specific standard it wanted the government to clearly and consistently apply. The frontier model preparedness frameworks are the closest thing, and while they're good, for the reasons you outline - which I agree with - they are flexible and not specific standards.

My central point, which you did not actually offer a refutation of, is that blaming the Andreessens of the world for policy confusion in this moment is wrong. Are they profoundly unhelpful? Again, I say, indeed. But were they responsible for the level of chaos here? No, and the AI safety community perpetually deflecting to outside enemies - especially hated enemies - instead of considering whether it has made serious strategic errors over time is a real collective failure.

If you shout at the top of every roof for years - "The government has to do something! We need a policy to limit this tech! We're on an exponential and we're about to upend the world, national security will never be the same, someone step in and put guardrails on!" - of course the government steps in at some point and says okay. And of course they'll do it wrong. They'll reliably do it wrong. Government doesn't have adequate technical capacity to understand the difference between ASI and a turboencabulator. The relevant question is how wrong will they be, and how much is ready to hand to them to help them understand; how prepared is the community to do diplomacy and engage.

The AI safety community failed to prepare for this moment. It set the scariest possible tone and message. It antagonized and heckled the administration at every turn. Was this reciprocated? Of course. Is this administration uniquely bad at diplomacy and uniquely committed to strongman tactics? Unquestionably. Should the AI safety community have rolled over like dogs? No, of course not. But does the AI safety community bear zero responsibility for a diplomatic outcome this bad? No, and everyone should be much more reasonable on this point. The AI safety community could have built a glide path towards actually good regulation that would have addressed the salient question here: to whom should what models be made available? What capabilities should be in the hands of the world as a whole, the American and allied universe in particular, or solely the government and the national security community? What powers should be held by the AI safety tribe, embodied in companies and nonprofits and think tanks and friendly govt institutions, versus with other people and agencies?

The failure to plan for these questions is a meaningful failure. The inability to reflect on this failure is a serious issue.

Some sharper, more pointed thoughts. The AI safety community often likes to pretend it is a total underdog, outmanned and outgunned, a small precious champion of light in a world of darkness, and so immune to the requirement to behave like an accountable and responsible power. It is not. It is a massive power with billions of dollars of ammunition and substantial influence and agenda ownership at the senior leadership level in all of the top labs. It exerts political power when it wants to flex the muscle, piling fellows into DC and think tanks, pouring millions into key congressional races. It has its own media ecosystem and news outlets; minor though they may be individually, in the aggregate substantial and with influence over key people in the space.

At some point it has got to reckon not with the thing it thinks itself to be, but with the thing it actually is, and the duties that come with that.

Yo Shavit@yonashav

In case you’re interested, the FMF has the closest thing to public guidance documents on frontier safety evaluation standards. https://www.frontiermodelforum.org/publications/#technical-reports They exist! They’re inherently not that detailed because there’s been no forcing function to get stakeholders to compromise on exact operationalization, plus an inherent need for flexibility given the rapid shifts in evaluation best practices.
