Cybersecurity Evals For AI Agents Use Long Horizons And Rigorous Verification · Digg