2h ago

Periodic Labs' Rohan Pandey jokes that backpropagation is the ultimate interpretability researcher, easily identifying and steering neuron behaviors

The joke critiques the complexity of modern mechanistic interpretability.
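For readers who want the idea behind the joke made concrete, here is a minimal sketch, not anything taken from the thread. It assumes a hypothetical toy network with made-up weights (`W1`, `W2`, `U`) and a scalar "behavior" score. Backpropagating that score to the hidden layer gives one gradient entry per neuron, which can be read as an attribution and also added to the activations as a steering vector.

```python
import math

# Hypothetical toy weights for illustration; nothing here comes from a real model.
W1 = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]  # 3 hidden neurons, 2 inputs
W2 = [[0.4, -0.7, 0.2], [0.3, 0.5, -0.9]]    # 2 units that read the hidden layer
U = [1.0, -0.5]                               # readout defining the "behavior" score


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def hidden(x):
    """Hidden activations h = tanh(W1 x)."""
    return [math.tanh(a) for a in matvec(W1, x)]


def score_from_hidden(h):
    """Scalar behavior score: U . tanh(W2 h)."""
    z = matvec(W2, h)
    return sum(u * math.tanh(a) for u, a in zip(U, z))


def grad_wrt_hidden(h):
    """Backpropagate the score to the hidden activations via the chain rule."""
    z = matvec(W2, h)
    dz = [u * (1.0 - math.tanh(a) ** 2) for u, a in zip(U, z)]  # d score / d z
    # d score / d h_j = sum_i dz_i * W2[i][j]
    return [sum(dz[i] * W2[i][j] for i in range(len(W2))) for j in range(len(h))]


x = [1.0, -2.0]
h = hidden(x)
g = grad_wrt_hidden(h)

# "Identify the neuron": the hidden unit with the largest gradient magnitude.
top_neuron = max(range(len(g)), key=lambda j: abs(g[j]))

# "Steering vector": nudge the activations a small step along the gradient.
alpha = 0.1
h_steered = [a + alpha * b for a, b in zip(h, g)]
base_score = score_from_hidden(h)
steered_score = score_from_hidden(h_steered)
```

The gradient is only a local, first-order signal, so the small step size `alpha` matters in this sketch. Repeating the update would amount to gradient ascent on the steering vector.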

Original post

my favorite interp researcher can identify neurons responsible for any behavior and provide steering vectors for them her name is backprop and her steering vectors are just gradients

1:32 PM · May 29, 2026

@aryaman2020 wow my interp take has been aryaman approved lfg

Aryaman Arora @aryaman2020

true

9:41 PM · May 29, 2026 · 2.1K Views
9:43 PM · May 29, 2026 · 686 Views

@ChengleiSi @aryaman2020 truly the goat

CLS @ChengleiSi

@khoomeik my favorite interp researcher is @aryaman2020, he can identify neurons responsible for any behavior by just eyeballing the matrices

10:13 PM · May 29, 2026 · 310 Views
10:15 PM · May 29, 2026 · 178 Views

@khoomeik my favorite interp researcher is @aryaman2020, he can identify neurons responsible for any behavior by just eyeballing the matrices

Rohan Pandey @khoomeik

my favorite interp researcher can identify neurons responsible for any behavior and provide steering vectors for them her name is backprop and her steering vectors are just gradients

8:32 PM · May 29, 2026 · 5.1K Views
10:13 PM · May 29, 2026 · 310 Views

@aryaman2020 related - I’m always curious why interpretability people design a cool new parameter efficient finetuning family like steering vectors and then choose not to optimize by gradient descent

Aryaman Arora @aryaman2020

true

9:41 PM · May 29, 2026 · 2.1K Views
10:29 PM · May 29, 2026 · 42 Views

@khoomeik im a gradients guy https://arxiv.org/abs/2604.07615

Rohan Pandey @khoomeik

@aryaman2020 wow my interp take has been aryaman approved lfg

9:43 PM · May 29, 2026 · 686 Views
9:46 PM · May 29, 2026 · 278 Views