White-Box Access Needed For Robust AI Safety And Security Evaluations

Black-box access may soon no longer be enough to robustly make or verify safety and security claims. Deeper, white-box access is a necessary update to counter 'evaluation awareness' and keep loss-of-control evaluations state of the art. A new policy blog explains why. 🧵

10:15 AM · May 27, 2026

Reposted by

#20@MILES_BRUNDAGE