Tech · 4h ago

Andy Jones questions whether AI safety reports are a consistent norm, citing declining page counts across recent models

Story Overview

Andy Jones flags a potential erosion of an early AI industry practice where labs released lengthy safety evaluations alongside new models. Examples include Fable's 319-page card, GPT-5.6's 77 pages, and Gemini 3's 26-page report, with Gemini 3.5 showing no comparable document so far. The observation leaves open whether shorter or absent reports reflect faster cycles or a fading emphasis on detailed transparency.

#184

Original post

david rein@idavidrein · #954 in Tech

I think we could've easily ended up in a world where labs published ~no analysis of the safety properties of their new models—instead labs often publish literally hundreds of pages per model! I'm really happy about this / this is an important norm to continue upholding.

11:33 AM · Jul 3, 2026 · 9.1K Views

Open Question

Documentation lengths signal uneven practices

Recent releases show page counts ranging from hundreds down to dozens or zero, even as some labs like Meta still deliver 160-page reports. It remains unclear whether this pattern will settle into a lighter standard or prompt renewed calls for consistency.

Industry Shift

Release speed may test old transparency habits

The thread highlights how quicker model drops could compress the time available for exhaustive safety write-ups that once accompanied frontier launches. Observers are left watching whether the original norm holds or adapts under current timelines.

Sentiment

Commenters welcome AI labs publishing extensive model safety analyses: detailed reports, such as one running 160 pages, are seen as a real improvement over complete ignorance.

Positive: 100.0%

Negative: 0.0%

3 comments with sentiment.

Posts from X

Most Activity

Ethan Mollick@emollick

It is consequential that the Labs grew out of a small group who thought AGI was a real goal & that had particular views of risks. I think we would have had very different discussions about AI safety if the big advances in LLMs came from IBM’s Westchester research labs or whatever

david rein@idavidrein

2h

Bookmarks: 1 · Likes: 12 · Replies: 3

andy jones@andy_l_jones

@idavidrein is it a norm? would be very happy to be wrong about this, but:

fable model card: 319 pages
gpt 5.6 model card: 77 pages
gemini 3* safety report: 26 pages

[afaict there's no gemini 3.5 model card or safety report]

david rein@idavidrein

2h

andy jones@andy_l_jones

@idavidrein muse spark: 160 pages, nice!

david rein@idavidrein

@andy_l_jones agreed there's a range—I'm mostly thinking that it could've easily been the case that the range was like 1-30 pages

2h

Tim Kostolansky@thkostolansky

@andy_l_jones @idavidrein idt num pages is necessarily a good measure of coverage

2h

🎭@deepfates

@idavidrein @andy_l_jones there's a pretty dynamic range outside of these labs though....

2h

Tim Kostolansky@thkostolansky

@andy_l_jones @idavidrein @rohinmshah talked about model cards not being the right format for paper-like depth on @80000Hours, at ~1h15m in

2h

1a3orn@1a3orn

@idavidrein Agree that's super nice relative to complete ignorance.

But we could also have been in a world where we knew the algorithms, RL environments, and training data for models -- and the mysteries around Opus 3 / reward hacking / CoT intelligibility would be far easier to solve.

18m

Tim Kostolansky@thkostolansky

@andy_l_jones @idavidrein actually its probably a good proxy at this point, on second thought

2h