Tech · 4h ago

Anthropic proposes policy framework giving governments authority to block unsafe frontier AI model deployments

Ryan Greenblatt warns the plan neglects risky internal deployments.

Original post
Thomas Woodside @Thomas_Woodside

Anthropic has published a new policy framework for frontier AI. I'm happy to see it! Importantly, it takes seriously the need to sometimes stop AI companies from taking actions that pose a substantial risk of catastrophic harm. There are also some areas where it could be improved.

1:30 PM · Jun 10, 2026 · 3.6K Views
Sentiment

Positive users praise Anthropic's policy framework for frontier AI as a solid contribution with useful safety ideas, while negative users criticize it for enabling unchecked internal deployments that bypass oversight.

Positive: 50.0%
Negative: 50.0%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 742 · Replies 2
Ryan Greenblatt@RyanPGreenblatt

Policies like this would be good and could be a decent first step. That said, these policies don't seem sufficient.

One concern I have is that it's unclear whether blocking usage/deployment applies to internal deployment and I think most of the risk is from internal deployment.

Anthropic@AnthropicAI

Our Advanced AI Framework sets out how governments should prepare for and prevent catastrophic risks from frontier AI systems.

The government should have the authority to block or revoke the release of unsafe models, and invest in societal resilience. http://www.anthropic.com/policy-on-the-ai-exponential/aaif

2h · Views 742 · Likes 7 · Bookmarks 0
Bookmarks 3 · Likes 14
Thomas Woodside @Thomas_Woodside

Finally, there are some notable areas where I think this framework could be improved (also not exhaustive):
- It's not clear from reading the framework whether the government authority to "block or deter unsafe models" applies to models that are used only internally at the company. While the scope of the risk reports explicitly covers such deployments (good!), this section does not specify. But internal use might be one of the most important risk areas given risks from loss of control resulting from automated AI R&D.
- It's unclear what information-collecting authority third-party evaluators have. For example, there is no general authority to examine necessary documents (for example, company evaluation logs) or collect information outside of the risk report process. I think we should be moving closer to an embedded auditor model in this respect.
- The regulator doesn't appear to have the flexibility to promulgate minimum standards for frontier AI frameworks or update any of the framework's key definitions over time. Given that legislatures often take time to act, I think it's important to include something like this in a framework.

1d · Views 277 · Likes 14 · Bookmarks 3
Retweets 2
Thomas Woodside @Thomas_Woodside

It also includes some good ideas that haven't been enacted in state law that I think would be a positive step forward (not exhaustive):
#1 Most significantly, the framework grapples directly with how the government could intervene to stop companies from taking dangerous actions. This includes if the evaluator finds "a significant risk of catastrophic harm." It is framed as just a "possible approach" and is fairly high level, but I think it's very positive to see Anthropic directly support a mechanism of this kind.
#2 Mandatory risk reports. Several state laws have a very barebones version of this for risks resulting from internal deployment, but the version in Anthropic's framework is significantly more detailed, including a residual risk assessment after safeguards are applied.
#3 Stronger audits ("independent evaluations") than are found in Illinois SB 315. The evaluator is supposed to assess overall levels of risk, not just whether the developer followed the processes that it said it would, as in SB 315. It also talks about exploring an accreditation system for evaluators and says evaluators could be randomly assigned to AI developers (instead of developers choosing whoever they want).
#4 Explicitly including automated AI R&D as a risk factor. This is present in all of the frontier AI companies' frontier AI frameworks, but none of the state laws; it should be included in policy frameworks going forward.

1d · Views 441 · Likes 13 · Bookmarks 2
Thomas Woodside @Thomas_Woodside

The framework includes many well-established governance mechanisms present in state law, including:
- Mandatory frontier AI frameworks.
- Incident reporting.
- Whistleblower protections.
- Penalties for violations and false statements.
- Third-party audits.

1d · Views 500 · Likes 14 · Bookmarks 1
Thomas Woodside @Thomas_Woodside

This is a solid contribution to the policy landscape, and I'm glad Anthropic released it! I hope more companies put out proposals like this so their views can be publicly scrutinized. Recently, OpenAI also put out a policy framework that has some good points but is much more high-level than Anthropic's.

https://www-cdn.anthropic.com/files/4zrzovbb/website/0a58d567024a8b448ff15158ebc3625328dfcc1f.pdf

https://openai.com/index/public-policy-agenda/

1d · Views 271 · Likes 6 · Bookmarks 2
Rugbist@rugbist_

@RyanPGreenblatt this is the real gap nobody talks about. internal deployment with no oversight is basically a backdoor for all the risk they're trying to regulate

2h