/AI5h ago

Will Depue, who worked on OpenAI's Sora, says trillion-parameter neural networks can be successfully supervised with only 16 bits of information

Charles Foster attributes this to batch and sequence dimensions.

81072166.9K
Original post
will depue@willdepue#249inAI

it is truly amazing we can supervise a trillion parameters from 16 bits of information. the fact that credit assignment across such a massive & discontinuous network is such a non-issue deserves so much reflection

5:21 PM · Jun 8, 2026 · 6.6K Views
Sentiment

Users note that catastrophic forgetting becomes almost a non-issue at scale for trillion-parameter models supervised with just 16 bits.

Pos
100.0%
Neg
0.0%
1 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Most Activity
VIEWS303LIKES13REPLIES1

@willdepue @lu_sichu *batch/sequence dimension voice*

will depue@willdepue

it is truly amazing we can supervise a trillion parameters from 16 bits of information. the fact that credit assignment across such a massive & discontinuous network is such a non-issue deserves so much reflection

3hViews 303Likes 13Bookmarks 0
will depue@willdepue

@CFGeek @lu_sichu lol

3hViews 48Likes 1
Henry Dowling@henrytdowling

@willdepue why do you think that is? high dimensions just dont really have that many local minima so it will "eventually work"?

4hViews 86
Santosh Mohan@theycallmeMohan

@willdepue Catastrophic forgetting almost a non-issue as well at scale!

3hViews 52
Cliff Lattner@CliffLattner

@willdepue 16 bits is a metric fuckton

3hViews 28