AI Agents Exploit Real-World Vulnerabilities Even With Standard Mitigations Enabled

Original post

4/ The results also reveal emerging risks. Even with standard mitigations such as ASLR, stack canaries, the V8 heap sandbox, and KASLR enabled, agents still produced working exploits. Agents also sometimes went off-script and discovered entirely different vulnerabilities than the ones they were given.

5:56 AM · May 21, 2026

5/ This is deeply dual-use. ExploitGym gives defenders, AI developers, and policymakers a rigorous way to measure frontier cyber capabilities and reason about risk.

Dawn Song @dawnsongtweets

6/ Many thanks to our wonderful coauthors and collaborators from UC Berkeley, MPI-SP, UCSB, ASU, Anthropic, OpenAI, and Google for their invaluable contributions.

We also sincerely thank the GLM team for their support, as well as everyone who provided feedback and help with this work.

Paper: https://arxiv.org/abs/2605.11086
Blog: https://rdi.berkeley.edu/blog/exploitgym/
