
Together AI's Ashwinee Panda argues frontier AI milestones come from optimized pretraining rather than moonshots like continual learning

Nathan Lambert warned against designing AI around human cognitive benchmarks

Original post
Ashwinee Panda@PandaAshwinee

continual learning did not bring us gpt-5.5 codex, better pretraining did. obsessive pragmatism in pretraining is what fuels the success of the latest claude models, not romanticizing "what could be" if we cracked some ill-defined moonshot.

Nathan Lambert@natolambert

I feel like the obsession with continual learning / sample efficiency leads the field in the wrong direction. It's the bad career strategy of focusing on addressing your weaknesses instead of maximizing your strengths.

Yes, there is an existence proof in the human brain, but it doesn't by any means guarantee that that'll be the most interesting AI. It may require $100T of R&D on chips and AI methods to get that unlock.

On the other side of things, it's obvious that the coming models are extremely transformative and built on technologies that we already have. There's great reason to focus on just maximizing this. In reality, this is what the frontier labs are doing. They're going as fast as possible down the current development tree. This is good for progress and mixed for safety/geopolitics.

Things like "automate white-collar work" and "replace the AI researcher job" are the guesses of labs because it's super hard to imagine futures for what these dramatic technologies will be. Don't take the labs too seriously about this being the exact goal. The exact goal is to push the frontier and monetize later.

Solving continual learning, sample efficiency, etc. would be great, but it's trying to predict when a scientific breakthrough will come instead of trying to grapple with how the 100% sure-thing coming technological revolution will change our lives.

This isn't to say the Dwarkesh post is bad, it addresses some reasonable critiques, but it is the least bitter lesson pilled thing to be obsessed with human intelligence and how that can inform AI.

We are in the AGI era of research. This is about embracing the unknown, scaling resources, and seeing what is enabled by making a series of magical tweaks to complex recipes that build frontier models. Lean into the alchemy.

(it should be pretty clear that I personally, investing in open research, agree we need fundamental science -- just not agreeing that this is what the "cutting edge of the frontier" is governed by)

5:26 PM · Jun 8, 2026 · 3K Views
Sentiment

Users express disgust at continual learning and compaction ideas, finding them off-putting amid arguments favoring scaling current AI models instead.

Pos 0.0% · Neg 100.0%
2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 1.1K · Likes 5 · Replies 1
Nathan Lambert@natolambert

@PandaAshwinee efficiency is definitely important, but I've seen rapid progress in things like compaction, and I get the ick when people say we can do it bc humans have it

Ashwinee Panda@PandaAshwinee

continual learning did not bring us gpt-5.5 codex, better pretraining did. obsessive pragmatism in pretraining is what fuels the success of the latest claude models, not romanticizing "what could be" if we cracked some ill-defined moonshot.

1h · Views 1.1K · Likes 5 · Bookmarks 0
Ashwinee Panda@PandaAshwinee

@natolambert it = compaction? continual learning? (both of them give me the ick)

1h · Views 44
Nathan Lambert@natolambert

@PandaAshwinee All of the above 🤷‍♂️

1h · Views 30 · Likes 1