3h ago

Will Depue proposes a distance-based training penalty for weight interpretability, which Ashwinee Panda critiques as heavy-handed

Standard training typically scrambles weight matrices learning identity functions.

5520265.3K

——0——

Original post

random cute idea: make models more visually interpretable by adding a tiny ‘distance‘ penalty which encourages connecting close in row space. if you look at a weight matrix that learns the identity, everything is usually scrambled hmm what other cute penalties are there

7:30 AM · May 28, 2026

#1365Ashwinee Panda@PANDAASHWINEE

@willdepue auxiliary losses are a butcher’s move

will depue@willdepue

2:30 PM · May 28, 2026 · 5.2K Views

3:22 PM · May 28, 2026 · 197 Views

Will Depue proposes a distance-based training penalty for weight interpretability, which Ashwinee Panda critiques as heavy-handed

Cluster engagement

Sentiment