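Anthropic researcher Liv Gorton questions claims that the interpretability community ever held a strictly one-dimensional view of the linear representation hypothesis (LRH), saying she is not sure such claims were actually made or treated as consensus, and points to a July 2024 post on multidimensional linear features.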

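Aryaman Arora replies that the post is clearer than his earlier understanding of the Anthropic interpretability position on the LRH, and agrees with its view that multidimensional linear features can still be read out in 1D.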

Original post

Am definitely not saying communication about the LRH has always been clear but more am confused about people talking about previous claims that I'm not sure were actually made or at least considered consensus.

3:58 PM · May 21, 2026

Aryaman Arora @aryaman2020

@livgorton i think the lack of formalisation is worse if anything. you shouldn’t have to guess what people were thinking research-wise, they should be writing such things down

4:14 PM · May 22, 2026 · 154 Views

Liv @livgorton

@aryaman2020 I definitely agree that it's not been communicated precisely, although I think the July 2024 multidimensional linear features post ~resolved that for me on the LRH side. Do you have takes on that?

4:43 PM · May 22, 2026 · 84 Views

Liv @livgorton

@aryaman2020 My confusion is more so around very specific claims being made about what the interp community broadly thought when idk if that was a mainstream view. I think misunderstanding from unclear formalisations is totally fair game and should be cleared up by the group/person/community.

4:46 PM · May 22, 2026 · 26 Views

Aryaman Arora @aryaman2020

@livgorton oh interesting, I hadn’t read that somehow! it’s a lot clearer than my initial understanding of the Anthropic interp position on LRH was, and 2024 is pretty early to this. I especially agree w their view that multidimensional linear features can still be read out fine in 1D

5:09 PM · May 22, 2026 · 23 Views

Aryaman Arora @aryaman2020

@livgorton (of course, you have to pick a good feature basis to represent the manifold)

5:15 PM · May 22, 2026 · 13 Views