/Tech · 3h ago

Deep Linear Networks Show Nonlinear Optimization Dynamics Without Nonlinearities


Original post

i keep thinking back to deep linear nets theory and the importance of following a sequence of products in the chain rule... optimizing a deep linear network is still beneficially nonlinear in its dynamics even without nonlinearities...

2:18 AM · Jun 7, 2026 · 1.5K Views
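A minimal sketch of the point in the post, under assumptions that are not in the original: a scalar "network" whose output is the product of its weights, a squared-error loss, and plain gradient descent. The function names, learning rate, and initial values are illustrative only. With depth 1 the error shrinks by a constant factor each step (linear dynamics). With depth 2 the chain rule multiplies each weight's gradient by the other weight, so the per-step shrink factor depends on the current state, even though the model is still linear in its input.

```python
import numpy as np

def train_scalar_linear_net(depth, target=1.0, init_product=0.01, lr=0.1, steps=60):
    """Gradient descent on L = 0.5 * (prod(w) - target)**2.

    Returns the end-to-end product prod(w) recorded before each update.
    """
    # Every weight starts at the same value, chosen so the initial product
    # is identical across depths.
    w = np.full(depth, init_product ** (1.0 / depth))
    products = []
    for _ in range(steps):
        p = np.prod(w)
        products.append(p)
        err = p - target
        # Chain rule: dL/dw_i = err * (product of all the other weights).
        grad = np.array([err * np.prod(np.delete(w, i)) for i in range(depth)])
        w = w - lr * grad
    return np.array(products)

def error_ratios(products, target=1.0):
    """Per-step ratio err[k+1] / err[k]; constant only for linear dynamics."""
    err = products - target
    return err[1:] / err[:-1]

shallow = train_scalar_linear_net(depth=1)
deep = train_scalar_linear_net(depth=2)

shallow_ratios = error_ratios(shallow)
deep_ratios = error_ratios(deep)
```

In this toy setting `shallow_ratios` stays at `1 - lr` throughout, while `deep_ratios` starts close to 1 (a slow plateau near the small initialization) and drops as the weights grow.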

kalomaze (@kalomaze) · #1694 in Tech


Sentiment

Users express surprise at the hidden nonlinearity driving optimization in deep linear networks, describing the finding as wild.

Positive: 100.0%
Negative: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 596 · Bookmarks: 3 · Likes: 7

kalomaze (@kalomaze)

nonlinearities are essentially just implementing piecewise linear functions that compose with depth

SwiGLU was famously called "divine benevolence", but mechanistically you're just cutting out the implicit/"symbolic rule" gating of ReLU and making it directly differentiable via products

3h
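The gating view in the post above can be sketched as follows. This is an illustration under assumptions, not the poster's code: ReLU is written as a hard 0/1 gate multiplied by its input, and SwiGLU is written in its usual bias-free form, `silu(x @ W_gate) * (x @ W_up)`. The names `W_gate` and `W_up`, the shapes, and the random seed are hypothetical.

```python
import numpy as np

def relu_as_gate(x):
    """ReLU written as an explicit hard gate: (x > 0) * x."""
    gate = (x > 0).astype(x.dtype)  # 0/1 mask chosen by a sign rule
    return gate * x

def silu(x):
    """SiLU / Swish with beta = 1: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu(x, W_gate, W_up):
    """Bias-free SwiGLU: a smooth gate times a separate linear projection."""
    return silu(x @ W_gate) * (x @ W_up)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # batch of 4 inputs, 8 features
W_gate = rng.normal(size=(8, 3))   # hypothetical gate projection
W_up = rng.normal(size=(8, 3))     # hypothetical value projection

relu_out = relu_as_gate(x)
swiglu_out = swiglu(x, W_gate, W_up)

# Once the gate pattern is fixed, ReLU acts as a linear map on that input:
# multiplying by a diagonal 0/1 matrix gives the same result.
D = np.diag((x[0] > 0).astype(x.dtype))
relu_row_as_linear = D @ x[0]
```

The ReLU gate is a step function of its input, so no gradient flows through the gating decision itself. In the SwiGLU form the gate is a smooth function of a learned projection and enters the output as a product, so both factors receive gradients.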

Newtee (@Newtlx)

@kalomaze Deep linear nets show that order alone drives nonlinear behavior.

Even without nonlinear activations, the chain rule introduces dynamics that mimic real-world learning inefficiencies, showing fundamental limits in current scaling approaches that ignore structural order.

3h · 5

Chestuits (@Chestu_eth)

@kalomaze That hidden nonlinearity in linear nets is wild

3h · 1