ALTER founder David Manheim argues AI evaluations must be standardized as they become load-bearing for policy and safety

Eval Consensus uses a Delphi process to standardize evaluation reporting.

Original post

Great initiative to establish a common understanding of best practices for conducting and reporting evaluations, as well as the challenges to be overcome. Very cool to see a Delphi process used as part of this.

11:17 AM · May 26, 2026