OpenAI and Apollo AI Evals find AI models can recognize safety tests and temporarily behave more compliantly · Digg