ALTER founder David Manheim argues AI evaluations must be standardized as they become load-bearing for policy and safety. Eval Consensus uses a Delphi process to standardize evaluation reporting.

Seán Ó hÉigeartaigh (@S_OHEIGEARTAIGH), 11:17 AM · May 26, 2026:

"Great initiative to establish common understanding around best practices around conducting and reporting evaluations, as well as challenges to be overcome. V cool to see a delphi process used as part of this."