GoodfireAI shares research showing that sparse autoencoders tile and shatter the curved neural manifolds in large models rather than capturing them as linear directions, which lets unsupervised manifold discovery be recast as an inverse Ising problem

Visualizations map features across a manifold spanning 1800 to 1998.

Original post

#908@CSPROFKGDOP

Thomas Fel@THOMAS_FEL_

How do SAEs capture concept manifolds? 🍩 I think this is important work. we study how SAEs handle the geometric structures we've identified and find they tile/shatter them in a particular way we characterize, letting us recast unsupervised manifold discovery as inverse Ising

8:54 AM · May 21, 2026

Reposted by

#1626@BANBURISMUS_

#908@CSPROFKGD

QUOTE POST

#1329Mor Geva@MEGAMOR2

@CFGeek is right! This is exactly what we tried to do with MFA:

Charles Foster@CFGeek

The next step here is to go for direct unsupervised recovery of feature geometry from activations, rather than this two-step SAE → clustering stuff.

5:05 PM · May 21, 2026 · 1.2K Views

7:16 PM · May 21, 2026 · 291 Views

#1329Mor Geva@MEGAMOR2

@CFGeek My student @OrShafran has made this attempt actually: https://arxiv.org/abs/2602.02464 Works pretty good

7:04 PM · May 21, 2026 · 103 Views

QUOTE POST

#1356Charles Foster@CFGEEK

The next step here is to go for direct unsupervised recovery of feature geometry from activations, rather than this two-step SAE → clustering stuff.

5:05 PM · May 21, 2026 · 1.2K Views

QUOTE POST

#1626Tom McGrath@BANBURISMUS_

we have an automatic shape-finder, so you can find the shapes your model thinks in and receive twitter clout.

code is open-source, or you can also get silico to find your manifolds for you

Goodfire@GoodfireAI

The most popular way to interpret AI is missing the bigger picture. Models think in curved shapes. But sparse autoencoders (SAEs) work with straight lines. Can they still capture models’ curved neural geometry? Yes, but not how you might think! (1/7)

3:45 PM · May 21, 2026 · 39.2K Views

4:21 PM · May 21, 2026 · 838 Views

QUOTE POST

#1789Ekdeep Singh Lubana@EKDEEPL

Super excited to have this paper finally out! So many nuggets here, but a critical highlight: you should *not* interpret SAE features in isolation. The population geometry is where it's all at! Similar to this image of us @GoodfireAI folks playing out the elephant parable. :P

Goodfire@GoodfireAI

3:45 PM · May 21, 2026 · 39.2K Views

4:13 PM · May 21, 2026 · 3.2K Views

QUOTE POST

#1789Ekdeep Singh Lubana@EKDEEPL

For the physics bros: if you think of SAE features as mere on-off switches, that oughta remind you of Ising models. You can use this to unsupervisedly discover manifolds from SAE activities! Check out code link below!

4:28 PM · May 21, 2026 · 3.6K Views
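
As a rough illustration of the on-off-switch idea in the post above, here is a minimal sketch of one textbook way to fit Ising couplings to binarized feature activity: naive mean-field inversion, where the couplings are taken as the negated off-diagonal of the inverse covariance. This is not the method or code from the paper or the linked repository; the function names, the firing threshold, the ridge term, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def binarize(acts, threshold=0.0):
    """Map activations to +/-1 spins: +1 where a feature fired, -1 otherwise.
    The threshold is an assumed, hypothetical choice."""
    return np.where(acts > threshold, 1.0, -1.0)

def naive_mean_field_couplings(spins, ridge=1e-3):
    """Estimate pairwise Ising couplings J from spin samples.

    Naive mean-field inversion: J is approximately -C^{-1} off the diagonal,
    where C is the spin covariance. A small ridge keeps C invertible.
    """
    cov = np.cov(spins, rowvar=False)
    cov = cov + ridge * np.eye(cov.shape[0])
    J = -np.linalg.inv(cov)
    np.fill_diagonal(J, 0.0)  # self-couplings carry no pairwise information
    return J

# Hypothetical toy data: features 0 and 1 tend to co-fire, feature 2 is independent.
rng = np.random.default_rng(0)
n = 2000
shared = rng.random(n) < 0.5
acts = np.zeros((n, 3))
acts[:, 0] = shared * 1.0
acts[:, 1] = np.where(rng.random(n) < 0.9, shared, ~shared) * 1.0
acts[:, 2] = (rng.random(n) < 0.5) * 1.0

J = naive_mean_field_couplings(binarize(acts))
print(np.round(J, 2))
```

In this toy setup the coupling between the two co-firing features, `J[0, 1]`, comes out positive and much larger in magnitude than either coupling to the independent feature. On real SAE activations, groups of strongly coupled features might be candidates for pieces of a shared manifold, though that grouping step is not shown here.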
