I think we could've easily ended up in a world where labs published ~no analysis of the safety properties of their new models—instead labs often publish literally hundreds of pages per model! I'm really happy about this / this is an important norm to continue upholding.
Andy Jones questions whether AI safety reports are a consistent norm, citing declining page counts across recent models
Story Overview
Andy Jones flags a potential erosion of an early AI industry practice where labs released lengthy safety evaluations alongside new models. Examples include Fable's 319-page card, GPT-5.6's 77 pages, and Gemini 3's 26-page report, with Gemini 3.5 showing no comparable document so far. The observation leaves open whether shorter or absent reports reflect faster cycles or a fading emphasis on detailed transparency.
Documentation lengths signal uneven practices
Recent releases show page counts dropping from hundreds to dozens or zero, even as some labs like Meta still deliver 160-page reports. It remains unclear if this pattern will settle into a lighter standard or prompt renewed calls for consistency.
Release speed may test old transparency habits
The thread highlights how quicker model drops could compress the time available for exhaustive safety write-ups that once accompanied frontier launches. Observers are left watching whether the original norm holds or adapts under current timelines.
Users welcome AI labs publishing extensive safety analyses of their models, noting that detailed reports, such as one running to 160 pages, offer far more transparency than no analysis at all.
Most Activity
It is consequential that the Labs grew out of a small group who thought AGI was a real goal & that had particular views of risks. I think we would have had very different discussions about AI safety if the big advances in LLMs came from IBM’s Westchester research labs or whatever
@idavidrein is it a norm? would be very happy to be wrong about this, but:
fable model card: 319 pages
gpt 5.6 model card: 77 pages
gemini 3* safety report: 26 pages
[afaict there's no gemini 3.5 model card or safety report]
@idavidrein muse spark: 160 pages, nice!

@andy_l_jones agreed there's a range—I'm mostly thinking that it could've easily been the case that the range was like 1-30 pages

@andy_l_jones @idavidrein idt num pages is necessarily a good measure of coverage

@idavidrein @andy_l_jones there's a pretty dynamic range outside of these labs though....

@andy_l_jones @idavidrein @rohinmshah talked about model cards not being the right format for paper-like depth on @80000Hours, at ~1h15m in

@idavidrein Agree that's super nice relative to complete ignorance.
But we could also have been in a world where we knew the algorithms, RL environments, and training data for models -- and the mysteries around Opus 3 / reward hacking / CoT intelligibility would be far easier to solve.

@andy_l_jones @idavidrein actually it's probably a good proxy at this point, on second thought