Periodic Labs' Rohan Pandey jokes that backpropagation is the ultimate interpretability researcher, using standard gradients to find and steer critical neurons · Digg