
Deep Linear Networks Show Nonlinear Optimization Dynamics Without Nonlinearities

Original post
kalomaze @kalomaze · #839 in AI

i keep thinking back to deep linear nets theory and the importance of following a sequence of products in the chain rule... optimizing a deep linear network is still beneficially nonlinear in its dynamics even without nonlinearities...

2:18 AM · Jun 7, 2026 · 1.5K Views
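
To make the claim in the post concrete, here is a minimal sketch of a toy case, not the poster's own code. It assumes a scalar two-layer linear "network" `w2 * w1` trained by plain gradient descent on a squared error, and compares it with a single weight `w` on the same target. The function names, learning rate, and initial values are all illustrative assumptions.

```python
def train_shallow(target, w0, lr, steps):
    """Gradient descent on 0.5 * (target - w)**2 for a single weight w."""
    w = w0
    trajectory = [w]
    for _ in range(steps):
        w += lr * (target - w)  # the update is linear in w
        trajectory.append(w)
    return trajectory


def train_deep(target, w0, lr, steps):
    """Gradient descent on 0.5 * (target - w2 * w1)**2 for two stacked weights."""
    w1 = w2 = w0
    trajectory = [w2 * w1]
    for _ in range(steps):
        err = target - w2 * w1
        # Chain rule: each weight's update is scaled by the other weight,
        # so the update is a product of parameters, not a linear function.
        step_w1 = err * w2
        step_w2 = err * w1
        w1 += lr * step_w1
        w2 += lr * step_w2
        trajectory.append(w2 * w1)
    return trajectory


def first_crossing(trajectory, level):
    """Index of the first step at which the trajectory reaches `level`."""
    return next(i for i, value in enumerate(trajectory) if value >= level)


# Illustrative settings (assumptions): small initial weights, modest step size.
shallow = train_shallow(target=1.0, w0=0.01, lr=0.1, steps=500)
deep = train_deep(target=1.0, w0=0.01, lr=0.1, steps=500)

print("steps to reach half the target (shallow):", first_crossing(shallow, 0.5))
print("steps to reach half the target (deep):   ", first_crossing(deep, 0.5))
```

Both models compute a linear function of the input, yet in this toy the single weight approaches the target with steadily shrinking steps, while the product `w2 * w1` sits on a plateau near zero, accelerates, and then saturates in a sigmoid-like curve. The only difference is the product of weights that the chain rule introduces.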
Sentiment

Users express excitement about the hidden nonlinearity in deep linear networks described in the thread, calling the finding wild.

Positive: 100.0% · Negative: 0.0% (1 comment with sentiment)
Posts from X
kalomaze @kalomaze

nonlinearities are essentially just implementing piecewise linears that compose with depth

SwiGLU was famously called "divine benevolence" but mechanistically you're just cutting out the implicit/"symbolic rule" gating of ReLU and making it directly differentiable via products

3h · Views 606 · Likes 7 · Bookmarks 3
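
The ReLU-versus-SwiGLU remark in the post above can be illustrated with a small scalar sketch. It assumes the usual formulations: ReLU written as an input multiplied by a hard 0/1 gate, and a SwiGLU-style unit written as `swish(w * x) * (v * x)`. The scalar weights `w` and `v` and the function names are hypothetical stand-ins for the weight matrices of a real layer, which would also apply an output projection.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def swish(z):
    """Swish / SiLU: the input scaled by a smooth gate in (0, 1)."""
    return z * sigmoid(z)


def relu_as_gate(x):
    """ReLU rewritten as a product: a hard gate decided by a sign rule."""
    gate = 1.0 if x > 0 else 0.0  # piecewise-constant, so no gradient flows through it
    return gate * x


def swiglu_scalar(x, w, v):
    """Scalar SwiGLU-style unit: a smooth gate branch times a linear branch."""
    gate_branch = swish(w * x)  # smooth in both w and x
    value_branch = v * x        # plain linear path
    return gate_branch * value_branch


print(relu_as_gate(2.0), relu_as_gate(-3.0))
print(swiglu_scalar(2.0, w=1.0, v=1.0))
```

In the ReLU form the gate is selected by a rule on the sign of the input, while in the gated form both branches are learned and the gating is an explicit product that gradients pass through on both sides.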
Newtee @Newtlx

@kalomaze Deep linear nets show that order alone drives nonlinear behavior

Even without nonlinear activations the chain rule introduces dynamics that mimic real world learning inefficiencies showing fundamental limits in current scaling approaches that ignore structural order

3h · Views 5
Chestuits @Chestu_eth

@kalomaze That hidden nonlinearity in linear nets is wild

3h · Views 1