Goodfire's Ekdeep Singh Lubana and Stanford's Christopher Potts find scaling parameters reduces gradient interference, helping models master rare tasks
AI Judge changed the title after evaluation. Original title: "Co-author Andrew Lampinen's research finds larger models learn more because scaling reduces parameter update interference"
Smaller models suffer representation loss due to neuron competition.