
GoodfireAI shares research showing that sparse autoencoders tile and shatter curved neural manifolds in large models instead of capturing each as a single linear direction, and recasts unsupervised manifold discovery as an inverse Ising problem

Visualizations map features across a manifold spanning 1800 to 1998.


Original post

Lee Sharkey

Goodfire@GoodfireAI

The most popular way to interpret AI is missing the bigger picture.

Models think in curved shapes. But sparse autoencoders (SAEs) work with straight lines.

Can they still capture models’ curved neural geometry? Yes, but not how you might think! (1/7)

Goodfire@GoodfireAI

Neural networks might speak English, but they think in shapes.

Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision.

Starting today, we’re releasing a series of posts on this research agenda. 🧵

8:45 AM · May 21, 2026 · 128.2K Views

Sentiment

Many users praised GoodfireAI's research on how sparse autoencoders capture curved neural geometry, saying it opens new possibilities in feature engineering and LLM communication, while a few raised limitations of the manifold approach or concerns about credit.

Pos: 92.6%

Neg: 7.4%

12 comments with sentiment.



Posts from X


Goodfire@GoodfireAI

So instead of interpreting features in isolation, what if we searched for features that act together?

We turned this idea into an unsupervised pipeline to cluster SAE features based on firing patterns.

Together, a cluster of features reveals the overall geometry. (6/7)
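The idea of grouping SAE features by how they fire together can be sketched in a few lines. This is only an illustrative stand-in, not Goodfire's pipeline: it assumes a matrix of SAE activations (`acts`, hypothetical), scores feature pairs by the Jaccard overlap of their firing sets, and takes connected components of the thresholded overlap graph as clusters. The function name, the threshold, and the toy data are all assumptions.

```python
import numpy as np

def cluster_by_cofiring(acts, threshold=0.4):
    """Group features whose firing sets overlap (illustrative stand-in only)."""
    fired = (np.asarray(acts) > 0).astype(float)      # (n_samples, n_features)
    inter = fired.T @ fired                            # pairwise co-firing counts
    counts = np.diag(inter)                            # how often each feature fires
    union = counts[:, None] + counts[None, :] - inter
    jaccard = np.where(union > 0, inter / np.maximum(union, 1.0), 0.0)
    adjacent = jaccard >= threshold                    # overlap graph

    # Connected components of the overlap graph via depth-first search.
    n = adjacent.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy activations: features 0-2 fire in overlapping windows, features 3-4 fire together.
acts = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0],
])
labels = cluster_by_cofiring(acts)
```

Features 0 and 2 never fire on the same sample here, yet they land in one cluster because each overlaps with feature 1. That chaining is what lets a cluster follow neighboring features along a curve.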

Goodfire@GoodfireAI

This helps explain why SAEs can feel both illuminating and unsatisfying!

Looking at SAE features one-by-one is like trying to understand the proverbial elephant by talking with each of the blind men: each label may be locally accurate, but the global structure is missing. (5/7)


Ekdeep Singh Lubana@EkdeepL

For the physics bros: if you think of SAE features as mere on-off switches, that oughta remind you of Ising models. You can use this to unsupervisedly discover manifolds from SAE activities! Check out code link below!
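To make the Ising analogy concrete: treat each feature as a spin that is +1 when it fires and -1 when it does not, then ask which pairwise couplings would explain the observed co-firing. The sketch below uses the textbook naive mean-field inversion (couplings are the negated off-diagonal entries of the inverse spin covariance). It is not the method from the paper or the linked repository; the function name, the ridge term, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def naive_mean_field_couplings(fired, ridge=1e-3):
    """Estimate Ising couplings J from 0/1 firing data (textbook approximation).

    fired: (n_samples, n_features) array of 0/1 indicators.
    Returns J with J_ij = -(C^{-1})_ij for i != j, where C is the covariance
    of the +/-1 spins. The ridge term (an assumption) keeps C invertible.
    """
    spins = 2.0 * np.asarray(fired, dtype=float) - 1.0
    cov = np.cov(spins, rowvar=False)
    cov = cov + ridge * np.eye(cov.shape[0])
    couplings = -np.linalg.inv(cov)
    np.fill_diagonal(couplings, 0.0)
    return couplings

# Synthetic data: feature b copies feature a 90% of the time, c is independent.
rng = np.random.default_rng(0)
n = 20000
a = rng.integers(0, 2, size=n)
flip = rng.random(n) < 0.1
b = np.where(flip, 1 - a, a)
c = rng.integers(0, 2, size=n)
fired = np.stack([a, b, c], axis=1)

J = naive_mean_field_couplings(fired)
```

In this toy setup the a-b coupling comes out strongly positive and the couplings to c stay near zero, so strongly coupled groups of features are candidates for lying on a shared structure.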


Ekdeep Singh Lubana@EkdeepL

Super excited to have this paper finally out! So many nuggets here, but a critical highlight: you should *not* interpret SAE features in isolation. The population geometry is where it's all at! Similar to this image of us @GoodfireAI folks playing out the elephant parable. :P


Thomas Fel@thomas_fel_

How do SAEs capture concept manifolds? 🍩

I think this is important work. we study how SAEs handle the geometric structures we've identified and find they tile/shatter them in a particular way we characterize, letting us recast unsupervised manifold discovery as inverse Ising


Ekdeep Singh Lubana@EkdeepL

For anyone coming from a neuro background, when you look at an SAE, it's inevitable that you are motivated to write a paper like above. Similar to how "neurons tile attractors", SAE atoms tile representation geometries in LLMs (slide from recent talks).


Mor Geva@megamor2

@CFGeek My student @OrShafran has made this attempt actually: https://arxiv.org/abs/2602.02464 Works pretty good

Charles Foster@CFGeek

The next step here is to go for direct unsupervised recovery of feature geometry from activations, rather than this two-step SAE → clustering stuff.


Matthew Kowal@MatthewKowal9

Frustrated your SAEs are only capturing linear structures and missing the real geometry inside neural networks? You can now fix that!

Way more to come here 🔥 stay tuned 😎


Goodfire@GoodfireAI

SAEs remain useful, as long as we’re aware of their limitations.

And we have new techniques in the works that recover manifolds more directly, allowing us to understand models better and control them more effectively!

Read the full post here: https://www.goodfire.ai/research/can-saes-capture-neural-geometry#


Goodfire@GoodfireAI

Consider the parable of the blind men encountering an elephant for the first time. Each touches a different part—the trunk, the tusk, the leg—and comes to a different conclusion about the elephant: one says it's like a tree, another says it’s like a rope, and so on. (2/7)


Mor Geva@megamor2

@CFGeek is right! This is exactly what we tried to do with MFA:

Charles Foster@CFGeek

The next step here is to go for direct unsupervised recovery of feature geometry from activations, rather than this two-step SAE → clustering stuff.


Goodfire@GoodfireAI

SAEs decompose neural representations using linear directions, decoding a model’s inner world.

Researchers hoped each direction (feature) would be a single concept, with magnitude corresponding to something like intensity or confidence. (3/7)
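For readers unfamiliar with the setup, a generic SAE forward pass looks roughly like the sketch below: an encoder produces non-negative feature activations, and the decoder rebuilds the input as a weighted sum of fixed directions. This is a common textbook formulation with random, untrained weights, not Goodfire's implementation; all names and sizes are assumptions.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Generic sparse-autoencoder forward pass (illustrative, untrained)."""
    # Encoder: ReLU keeps activations non-negative, so each feature is "on" or "off".
    features = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)
    # Decoder: reconstruction is a linear combination of the rows of W_dec,
    # each row being one feature's direction in activation space.
    x_hat = features @ W_dec + b_dec
    return features, x_hat

rng = np.random.default_rng(0)
d_model, n_features = 8, 32                       # hypothetical sizes
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
W_dec = rng.normal(size=(n_features, d_model))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)   # unit-norm directions
b_enc = -0.5 * np.ones(n_features)                # negative bias encourages sparsity
b_dec = np.zeros(d_model)

x = rng.normal(size=(4, d_model))                 # stand-in for model activations
features, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
```

The "magnitude as intensity" reading in the post corresponds to the value of each entry in `features`; the direction it scales is the matching row of `W_dec`.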


Goodfire@GoodfireAI

We now know that models think using curved shapes, not just straight lines. But SAE features can still give us a window into neural geometry.

How? We show that related SAE features often “tile” manifolds, pointing to different (but overlapping) regions on the curve. (4/7)
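A toy picture of tiling, under strong simplifying assumptions: take the unit circle as the "manifold" and give it eight hand-placed ReLU features, each a straight direction plus a negative bias. Each feature then fires only on an arc around its direction, and neighboring arcs overlap. Nothing here is learned or taken from the paper; the feature count and bias are chosen by hand for illustration.

```python
import numpy as np

# Eight feature directions evenly spaced around the circle (hand-placed).
n_features = 8
centers = np.arange(n_features) * 2 * np.pi / n_features
directions = np.stack([np.cos(centers), np.sin(centers)], axis=1)   # (8, 2)

# With this bias a feature fires only within 30 degrees of its direction.
bias = -np.cos(np.pi / 6)

# Sample the circle at one-degree steps.
angles = np.linspace(0, 2 * np.pi, 360, endpoint=False)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)         # (360, 2)

# ReLU feature activations: zero outside each feature's arc.
acts = np.maximum(0.0, points @ directions.T + bias)                # (360, 8)

active_per_point = (acts > 0).sum(axis=1)     # features firing at each point
points_per_feature = (acts > 0).sum(axis=0)   # arc size of each feature
```

Every point on the circle activates one or two features, and no single feature covers more than about a sixth of it, so any one feature gives only a local view of the curve.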


Goodfire@GoodfireAI

@burny_tech precisely



Ekdeep Singh Lubana@EkdeepL

Paper led by our awesome new hire @ushabhalla_ and our beast @thomas_fel_!


Thomas Fel@thomas_fel_

that said, we see this mostly as a workaround (a patch on top of SAEs) and we don't see clustering SAE features post-hoc as the long-term answer.

lot of cool works coming soon 👀

paper: https://arxiv.org/pdf/2604.28119 blogpost: https://www.goodfire.ai/research/can-saes-capture-neural-geometry#


Sarah Wiegreffe@sarahwiegreffe

@GoodfireAI what a roll-- y'all need to slow down so I can catch up on all this reading😂


Gerard Sans | Axiom 🇬🇧@gerardsans

@banburismus_


Victor Levoso@VictorLevoso

@GoodfireAI @burny_tech Mm, but I wonder if we can't do better somehow by making something like an SAE, but where it has to learn to compress activations into a few manifolds somehow. But maybe just aggregating from SAE features will turn out to scale better, dunno. Feels kind of hacky though.


Ekdeep Singh Lubana@EkdeepL

Github link: https://github.com/goodfire-ai/sae-manifold
