Just to reiterate the concern about focusing too much on pre-deployment testing when evaluating AI alignment/scheming:
In the immediately-pre-deployment AI testing paradigm, the model development team, to some approximation, cooks up the best model it can and then passes it to a safety testing team just before deployment. The safety testing team then runs some tests and decides whether the model is safe to deploy publicly or not.
For loss-of-control testing, this doesn’t really make sense, since the target you’re worried about is the AI lab itself! A scheming model could start causing trouble as soon as it is deployed internally, well before any public release. If anything, sharing the model with the world at least has a chance of transmitting information about the tendency of your models to scheme or sabotage, which could be useful for coordinating a response. If you were going to sit on a model, you’d want to do so before it was deployed internally at the AI company, not at the point of public deployment.