Expert Warns AI Executive Order Cyber Threshold May Cover All Models

We’re going to be having the same conversations we had on cyber about bio, autonomy/R&D/RSI, etc. in the next few months, all while major labs are looking to go public. We need a lot more agility in governance and a lot more visibility to know what to govern.

Divyansh Kaushik@dkaushik96

This just gets to how hard it is to design a mechanism to address just one kind of threat. I go back to my original worry that most policymakers aren’t registering what’s truly happening here (which is exactly why institutions like CAISI are all the more important to give us visibility into what to prepare for).

Divyansh Kaushik@dkaushik96

What’s going to be our response on bio or autonomy? Are we willing to make hard decisions? We struggle with decision making already, and it’s getting harder to get things right on the first go.

Things will get weird and only about 200 or so people in DC are paying attention.

Divyansh Kaushik@dkaushik96

And then there’s the issue that this threshold is per-model. Microsoft’s MDASH found more vulnerabilities than any single frontier model by orchestrating hundreds of smaller ones together, but none of those would be covered individually. So we leave a major aspect of the threat space uncovered.
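
A toy sketch of the per-model gap described above. It does not model MDASH or any real system: the vulnerability IDs, the finding sets, and the threshold of 4 are all made up for illustration, and "number of distinct findings" is an assumed stand-in for whatever capability measure a real benchmark would use.

```python
def is_covered(findings, threshold):
    """A model is 'covered' if it finds at least `threshold` distinct issues."""
    return len(findings) >= threshold


# Hypothetical per-model coverage threshold (illustrative only).
THRESHOLD = 4

# Hypothetical findings from one frontier model and four smaller models.
frontier_findings = {"V1", "V2", "V3", "V4"}
small_model_findings = [
    {"V1", "V2"},
    {"V2", "V3"},
    {"V4", "V5"},
    {"V6"},
]

# An orchestrator simply pools what the smaller models find.
combined_findings = set().union(*small_model_findings)

any_small_model_covered = any(
    is_covered(findings, THRESHOLD) for findings in small_model_findings
)

print("frontier covered:", is_covered(frontier_findings, THRESHOLD))
print("any small model covered:", any_small_model_covered)
print("orchestrated findings:", len(combined_findings))
```

Under these assumed numbers, the pooled system finds more than the frontier model while no component crosses the per-model line, which is the gap the post points at.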

Divyansh Kaushik@dkaushik96

Just on cyber first: the EO is a good step, but I’m a little worried about the benchmarking that NSA has to do to define a covered model by cyber capability. Say we set the bar at X, calibrated to today’s defenses. In 6-12 months, X catches every model released (defenses lag while other labs release capable models). Yes, X moves up eventually, but in the near term we risk making every model covered. We may end up building a threshold that is no longer useful.
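
A minimal sketch of the threshold-drift worry, with every number invented for illustration: the scores, the bar X = 50, and the uniform capability gain are assumptions, not measurements of any real model or benchmark.

```python
def covered_fraction(release_scores, threshold):
    """Share of released models whose score meets or exceeds the threshold."""
    if not release_scores:
        return 0.0
    covered = sum(score >= threshold for score in release_scores)
    return covered / len(release_scores)


# Hypothetical bar X, calibrated to today's releases (illustrative only).
THRESHOLD_X = 50.0

# Hypothetical benchmark scores for models released today.
releases_today = [30.0, 42.0, 55.0, 61.0]

# Assume each lab's next release gains the same made-up amount
# while the bar X stays fixed.
ASSUMED_GAIN = 25.0
releases_later = [score + ASSUMED_GAIN for score in releases_today]

share_today = covered_fraction(releases_today, THRESHOLD_X)
share_later = covered_fraction(releases_later, THRESHOLD_X)

print("covered today:", share_today)
print("covered after the assumed gain:", share_later)
```

With a fixed X and rising scores, the covered share climbs toward 100 percent, at which point the threshold no longer separates anything.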

Divyansh Kaushik@dkaushik96

All that said, a model’s capability scales with how much test-time compute you give it. Hand a weaker, uncovered model far more compute and it may find exploits the benchmark never priced in. So what are we actually measuring? The line moves the moment someone spends more.
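
One rough way to see the test-time compute point is repeated sampling. The sketch below assumes each attempt succeeds independently with a fixed probability, which is a simplification (real attempts are correlated), and the per-attempt rates are hypothetical.

```python
import math


def success_probability(per_attempt_rate, attempts):
    """Chance of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


def attempts_to_match(per_attempt_rate, target):
    """Smallest number of independent tries whose success chance reaches `target`."""
    if not 0.0 < per_attempt_rate < 1.0:
        raise ValueError("per_attempt_rate must be strictly between 0 and 1")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - per_attempt_rate))


# Hypothetical per-attempt success rates (illustrative only).
STRONG_RATE = 0.20  # stronger, "covered" model, evaluated with one attempt
WEAK_RATE = 0.01    # weaker, "uncovered" model

strong_once = success_probability(STRONG_RATE, 1)
weak_many = success_probability(WEAK_RATE, 500)
break_even = attempts_to_match(WEAK_RATE, strong_once)

print("strong model, 1 attempt:", strong_once)
print("weak model, 500 attempts:", round(weak_many, 4))
print("attempts for the weak model to match:", break_even)
```

Under these assumptions, a benchmark that scores a fixed number of attempts per model understates what the weaker model can do once someone pays for more attempts.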
