Will coding agents take opportunities to undermine safeguards designed to oversee them?
We tackle this question in two ways: automated auditing in simulated agentic environments, and scheming honeypot evaluations built on real internal alignment research codebases. Read more in our blog post.