
ICML Mech Interp Workshop Draws 800+ Submissions Amid Quality Concerns


Original post

a) mech interp (a.k.a "how the #&@! do these models do what they do") is an incredibly interesting and important topic to study, regardless of "safety implications"

b) as a previous Area Chair for interpretability tracks, these are the worst tracks to review. all works are meh.

juliana (@juli_li_)

wondering why Mech Interp academia is growing so much faster than every other safety subfield (despite being relatively uncommon in industry AI safety teams).

i'm guessing it's partially due to low barrier to entry, hope this doesn't lead to too much publication slop farming

12:29 AM · Jun 13, 2026 · 2.7K Views


(((ل()(ل() 'yoav))))👾 (@yoavgo)

the b above does not mean no works are good. actually most are decent. but nothing is very conclusive. there are no "benchmark" people compete over. everyone studies a slightly different topic, and all get some level of inconclusive answers. how do you pick one work over others?
