Lukasz Kaiser Examines Transformer Limits and Coding Agent Gains

Original post

Rishabh Agarwal#195

Xander Dunn (@xanderai)

tl;dr It’s the Age of Research

Something is off about models: 1) Training is so sample-inefficient. 2) Very long thinking trajectories: models will do the right thing only after exhausting all other possibilities. 3) Generalization is whack. Waymo can’t handle construction on the highway. No teenager would have this problem.

Unclear why December was an inflection point for coding agents. No single clear attributable change. Google is still pre-December on coding capabilities.

5-10x speedup in work. 3 weeks to implement a paper in ye olden days. 2 days with Codex, plus can do many things in parallel.

Humans see, hear, talk, everything all at once. Of course that’s how it should be for models.

Big update on how quickly we got to an intern level coding agent. Didn’t expect that in 2025.

8:41 PM · Jun 3, 2026 · 26K Views

Posts from X

Views: 411 · Likes: 2

echo.hive (@hive_echo)

@xanderai I am gonna say a boring thing but LLMs still have small brains

Training a dog is not very sample-efficient either: it takes many hundreds of reps

Maybe bigger will make the offness go away? 🤷‍♂️

Bookmarks: 1

T (@twiz19071051)

@xanderai The big versions of the models don't have these problems, but they're impossible to serve at a reasonable price.
