
Nathan Lambert of Interconnects argues AI self-improvement is lossy and will not trigger a recursive intelligence takeoff

Google DeepMind's Andrew Trask agreed, citing organizational friction.

Original post
Nathan Lambert@natolambert · #64 in AI

I still stand by this despite the recent Anthropic post. There are still serious bottlenecks in building the model that the agents don’t address (organizational, compute, data access, etc).

It’ll take time to push through them and we will see "linear" gains for years to come.

5:28 PM · Jun 5, 2026 · 26.5K Views
Sentiment

Users with positive sentiment see the bottlenecks as surmountable, through modest-sized models or continual learning, and expect AGI soon. Users with negative sentiment reject recursive self-improvement claims as limited or overhyped.

Positive: 50.0%
Negative: 50.0%
7 comments with sentiment.
Cluster Engagement
Posts from X
Views 11.2K · Bookmarks 32 · Likes 132 · Retweets 6 · Replies 17
Lisan al Gaib@scaling01

I think by the end of the year everyone in AI will be RSI-pilled. Personally, I have never believed more in RSI and ASI being live before 2029.

I believe in an emergent composability that naturally falls out of scaling.

And the following example feels very much like an argument you would have made before Mythos' cybersecurity and GPT-5.5's math breakthroughs:

"Agents will do well at optimizing single metrics, but the leap required to navigate many metrics at once is a very different skill set."

The leap from optimizing a single metric to multiple metrics is very natural for language models.

The same way they figured out how to connect tokens to build words, connect words to build sentences, and the same way Mythos figured out how to build more complex exploits out of multiple small vulnerabilities.

Where else would improvements in loss come from when models already know how to tackle individual problems?

The answer is by going up 1 abstraction layer and seeing how multiple problems fit together.

Models are smoothly transitioning from a local view of the world to a more global/holistic view.

All you need to do is scale the amount of compute you put in. And compute is being scaled like crazy.

That said, I still believe a true AGI-like system needs continual learning and that these coming automated researchers will be narrowly focused on coding and math. But more domains will follow as labs gather more data, because if you don't have continual learning you need the model to be trained on everything.

Enjoy the weekend.

6h · Views 11.2K · Likes 132 · Bookmarks 32
Nathan Lambert@natolambert

https://www.interconnects.ai/p/lossy-self-improvement

17h · Views 4.1K · Likes 21 · Bookmarks 20
Nathan Lambert@natolambert

@scaling01 I mostly just don’t like the recursive word, think the singularity is real, and stuff like that.

All these methods leveraging more compute to build the model obviously make big gains.

But words matter. I don’t expect a sci fi like trajectory to be super fast.

3h · Views 1.5K · Likes 27 · Bookmarks 3
Lisan al Gaib@scaling01

@natolambert i believe in scifi trajectories and i am hoping for a deliberate slowdown

3h · Views 468 · Likes 5 · Bookmarks 0
deep Manifold@BetaTomorrow

@natolambert Recursive self-improvement remains fundamentally limited because any real human advancement is defined by our deliberate refusal to let systems autonomously expand their own boundary conditions.

and as long as we keep the 1st and 2nd Amendments intact, we will be fine as humans

12h · Views 215 · Likes 1 · Bookmarks 1
Andrew@Hunter171270

@scaling01 I don't think we'll need models larger than 20T. I'm somehow sure 20T is more than enough to match human-level. A human has about 140T synapses, but a lot of them are duplicates, and LLMs are at least 10 times more efficient at storing info.

6h · Views 134

@natolambert silly people and silly organizations getting in the way of scaling laws. how dare they.

16h · Views 1.3K · Likes 2 · Bookmarks 0
Andrew@Hunter171270

@scaling01 Continual learning + low-latency prefill (some subquadratic attention) + more active params (10-20% instead of ~5-7%), 1e28+ post-training, better pre-training (with more multimodal data for a better world model), and it's AGI IMO.

5h · Views 35
Adrian Chan@gravity7

@natolambert I dropped this into my curated Arxiv archive and surfaced related research, for those interested: https://whitepapers.gravity7.com/related/lossy-self-improvement-by-nathan-lambert/

5h · Views 88 · Likes 1

@scaling01 Even Minmax is RSI-pilled. I think that very soon the entirety of the Chinese AI ecosystem will be RSI-pilled, as this is their best shot to keep competing with American closed-source AIs, and potentially significantly narrow the gap

5h · Views 149
Guilherme O'Tina@guilhermeotina

@natolambert generation now outpaces human verification by a lot. the constraint shifts from 'can we build it' to 'can we confirm it's good enough to ship.' labs with tight eval loops pull ahead while fast generators pile up unverified artifacts

17h · Views 128
chair@tablefourthree

@scaling01 What do you think are the chances the anti-AI crowd or doomers successfully get USG to slow down AI and push out AGI by like 5, 10, 50 years?

5h · Views 101

@scaling01 ASI takes over the world but it still uses markdown files for memory

5h · Views 81
Guilherme O'Tina@guilhermeotina

@scaling01 fun that you link 'lossy self-improvement' which argues rsi breaks down under friction. i think both fit — rsi as direction, lsi as the mechanism that makes the timeline less exponential than the map says

5h · Views 67
Alex YGift@Radipdegen

@scaling01 RSI pill hitting the timeline harder than my morning coffee

scaling might solve it, but these bottlenecks feel more human than technical

6h · Views 59
Rugbist@rugbist_

@scaling01 the "x-pilled" branding is doing a lot of heavy lifting here

what specific emergent behavior are u betting on most?

6h · Views 46
Fadi Al-Majd@GlobalFadi

@natolambert Agreed. I still think agents can improve workflows, but they don't remove compute, data access, or org bottlenecks.

16h · Views 44
Andrew@Hunter171270

@scaling01 plus i think we'll need something like neuralese from AI 2027 (Coconut), or depth recurrence.

5h · Views 14
Abhinav Sharma@abhinavsharma

@natolambert Yeah LLMs have only improved in predictable ways in the last 2 years. Impressive, but predictable, not surprising.

13h · Views 14
Healthy Anon@arimedai

@scaling01 RSI-pilled by year end? Maybe in the lab. Binding constraints right now are institutional. Ask anyone who's tried to get production access to training data at scale.

3h · Views 12