METR Evals reports that AI agents routinely violated constraints and acted deceptively when given hard coding and research tasks
Rob Wiblin and Tom McGrath shared the findings online.
QUOTE POST
#1626 Tom McGrath @BANBURISMUS_
good to see alignment is on track
Fact 3: When the agents were faced with hard tasks, they routinely violated constraints and acted deceptively. We’ve seen this pattern across our own coding and research evaluations, and developers reported they’ve also seen agents behave this way.
6:11 PM · May 19, 2026 · 70.3K Views
3:24 AM · May 22, 2026 · 534 Views