Goodfire's Ekdeep Singh Lubana and Stanford's Christopher Potts find that scaling parameters reduces gradient interference, helping models master rare tasks · Digg