AI · 9h ago

Meta FAIR's François Fleuret argues Transformers function like librarians, while RNNs are required for true reasoning

Luca Ambrogioni noted that designing such RNNs remains unsolved.

Original post
François Fleuret @francoisfleuret · #330 in AI

Hot take: Transformers are all-seeing ultrafast librarians. They have a very low incentive to extract and organize information, they can just "look around" to see correlating fragments.

RNNs done properly would have far stronger "conceptual embeddings" and would actually think.

10:16 PM · Jun 7, 2026 · 30K Views
Sentiment

Positive users agree RNNs could form stronger concepts than transformers by compressing meaning into limited compute, while negative users argue transformers became superhuman exactly because RNNs were never implemented properly.

Positive: 66.7% · Negative: 33.3% (6 comments with sentiment)
Cluster Engagement
Posts from X
Most Activity
Views 3.7K · Bookmarks 7 · Likes 61 · Retweets 1 · Replies 5

@francoisfleuret Except for the many real cases where you don't know yet that something will be important until you see something in the future. RNN is toast, Transformer handles it trivially.

7h · Views 3.7K · Likes 61 · Bookmarks 7
Andrew Bean @AndrewBean

My intuition is opposite - that conscious thought is primarily from sparse activations roughly corresponding to distinct concepts that basically get looked up through learned associations with other such concepts (symbolic reasoning-like). I'm imagining the RNN "strong conceptual embeddings" as fairly dense and opaque, which I would more think of as unconscious or intuitive.

8h · Views 143 · Likes 2 · Bookmarks 1
Loic cabannes @loiccabannes

@giffmana @francoisfleuret Except if we accept we have tools and can behave like humans by doing search over the past only when necessary and in a principled manner: https://arxiv.org/html/2510.14826v1

Best of both worlds: no infinite KV Cache and yet everything is still retrievable.

6h · Views 124 · Likes 2 · Bookmarks 1

@francoisfleuret Conceptually, without literally remembering everything, they can condense, memorize and retrieve all the ideas and concepts they have been exposed to. Expensive to train today; in the future your "personal database" will be one that is just continually/continuously trained. Your 2nd brain.

3h · Views 212 · Bookmarks 1
Dersu @tak3sh8

@francoisfleuret Aren't there a few communities trying to do that, for quite some time?

8h · Views 456 · Likes 2

@francoisfleuret Hotter take: I’m running a cognition substrate that doesn’t use tokens at all.

155+ days of continuous operation, learning through thermodynamic state evolution: no transformers, no RNNs, no resets.

https://zenodo.org/records/19703134

5h · Views 22 · Bookmarks 1
Jorge Alberto @JorgeA77832

@francoisfleuret RNNs over Transformers in the big 2026???

9h · Views 64 · Likes 2

@francoisfleuret Unfortunately nobody ever figured out how to do it properly

9h · Views 576 · Likes 4 · Bookmarks 0
Evi @geteviapp

@giffmana @francoisfleuret The desire to “fight” something seemingly simple is not new. There were lots of attempts to, e.g., create something other than the transistor, or even something other than binary in computers. Some attempts even looked like they might work, e.g. the ternary-based one: https://en.wikipedia.org/wiki/Setun

3h · Views 63 · Likes 1
drorlb @drorlb

@francoisfleuret Wasn't the first part basically the point of "Hopfield Networks are All You Need"?

8h · Views 113 · Likes 1
Atakan Tekparmak @AtakanTekparmak

@francoisfleuret Ideally an architecture that utilises the pros of attention and RNNs & derivatives; hell, even the cons of RNNs could be really good

7h · Views 66 · Likes 1
Marco Matthies @MarcoMatthies

@francoisfleuret Hotter take :) Context compaction makes our current agents into RNN-transformer hybrid systems

7h · Views 59 · Likes 1

@francoisfleuret @mike64_t https://goombalab.github.io/blog/2025/tradeoffs/

3h · Views 56 · Likes 1
el ayar yacine @YacineAyar

@francoisfleuret Reading these two papers gives valuable information and proves some advantages of DPLR RNNs over transformers regarding expressivity

https://arxiv.org/abs/2311.00208v3

https://arxiv.org/abs/2603.03612

2h · Views 173
Gary Basin @garybasin

@francoisfleuret @mike64_t Transformers in a loop are much stronger, no?

3h · Views 154
Simon @SimonGoodman_

@francoisfleuret You would have loved @tserre's keynote at CVPR

3h · Views 109
Loic cabannes @loiccabannes

@francoisfleuret H-net (Albert Gu) shows that SSMs are strictly better at compressing information and generating relevant chunks of tokens than transformers. Papers analysing the correlation between models and the brain also show that SSMs correlate better with brain activations than transformers

6h
Simon @SimonGoodman_

@giffmana @francoisfleuret That’s for known RNN architectures, I believe. We have not found a way to model and scale long-range dependencies in RNNs, but maybe one day we will

3h · Views 85
Fraser @FraserGreenlee

@francoisfleuret And this is because RNNs have to compress everything they've seen into a single hidden state?

7h · Views 79
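
The trade-off the thread keeps circling (a fixed-size recurrent state versus an attention cache that keeps every past token) can be illustrated with a toy sketch. This is only a rough illustration, not anything posted in the thread: `rnn_step`, `run_rnn`, `run_kv_cache`, and `attend` are hypothetical names, the "RNN" is an untrained leaky accumulator, and the attention is a single head with keys reused as values.

```python
import math
import random


def rnn_step(state, x, decay=0.9):
    """Fold one input into a fixed-size state (toy leaky accumulator, not a trained RNN)."""
    return [math.tanh(decay * s + (1.0 - decay) * xi) for s, xi in zip(state, x)]


def run_rnn(inputs, dim):
    """Process a sequence while keeping only `dim` numbers of memory."""
    state = [0.0] * dim
    for x in inputs:
        state = rnn_step(state, x)
    return state  # size is `dim`, however long the sequence was


def run_kv_cache(inputs):
    """Keep every past token, the way an attention cache does."""
    cache = []
    for x in inputs:
        cache.append(list(x))
    return cache  # grows linearly with the sequence length


def attend(cache, query):
    """Softmax-weighted lookup over all cached tokens (single head, keys == values)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in cache]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(query)
    return [sum(w * key[i] for w, key in zip(weights, cache)) / total for i in range(dim)]


if __name__ == "__main__":
    random.seed(0)
    dim = 4
    for n in (10, 1000):
        seq = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
        # Prints: sequence length, numbers held by the RNN state, numbers held by the cache.
        print(n, len(run_rnn(seq, dim)), len(run_kv_cache(seq)) * dim)
```

In this sketch the recurrent state stays at 4 numbers for any sequence length, so whatever it retains has to be compressed at the moment each token arrives. The cache holds 4 numbers per token, and `attend` can decide later which past token matters, which is the "look around" behaviour described in the original post.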