METR Researcher Clarifies Companies Lack Editorial Control Over Reports
As the first step in our process, we ran entity-level assessments for each company. These asked "What's the holistic picture of loss-of-control at [Company] at this point in time?", not "What's the risk from this specific [Company] model launch decision?".

Then we told each company what evidence we wanted to quote in our public report and negotiated any redactions. Our agreement carved out certain info, but generally they could block anything we couldn't have acquired ourselves (like by reading the news or evaluating public APIs).

We also offered the option to approve quotes *without attribution*. Through this mechanism, a company could disclose evidence that speaks to industry-wide issues (examples of reward hacking, lax controls, etc.), which they would individually have incentives not to publish.

As a backstop, we gave ourselves a carved-out right to publish a redaction summary indicator for each company. That sentence would let us flag whether a company insisted on a redaction we felt was material and blocked us from noting the specific redaction. No participant did.

Companies could exit from the pilot up to the point where they signed off on (redacted and/or anonymized) evidence, but not after. That meant companies knew exactly what non-public evidence we might cite, but couldn't directly control our downstream conclusions, framing, or tone.

This Frontier Risk Report is also the first time that we've used the AEF-1 standard from @aievalforum. I think it's important for organizations like METR to be transparent and accountable to the public ourselves, not just demand it of AI companies.

You can find the non-public evidence companies approved in the back of the report. Appendix B covers stuff that companies were OK having attributed to them individually, and Appendix C aggregates statements from across companies. Later appendices also include some CoT excerpts.