AI · 4h ago

Andrej Karpathy's autoresearch experiment shows AI agent performance does not plateau when test-time compute budgets scale

Performance improved steadily over hundreds of sequential experiments.

Original post

Noam is politely reminding us that if money is no issue, we have a (jagged) superintelligence already. Money is no issue for agents helping with internal research.

Noam Brown @polynoamial

http://x.com/i/article/2057694226981257216

10:03 PM · Jun 8, 2026 · 9.6K Views
Sentiment

Users sarcastically dismissed claims that AI performance shows no plateau even with massive test-time compute by equating it to a phone calculator outperforming humans at basic math.

Pos: 0.0%
Neg: 100.0%
1 comment with sentiment.
Cluster Engagement
Posts from X
Most Activity

@teortaxesTex MirrorCode blog was saying that

3h · Views 585 · Likes 5 · Bookmarks 1
REPLIES 2
Candide III @CandideIII

@teortaxesTex
> Money is no issue for agents helping with internal research.
It is though (insofar as money is an issue for AI companies). There is opportunity cost. Those GPU-hours might have been spent generating pr0n and nutrition recommendations for paying customers

3h · Views 45 · Likes 2
dani @absenteewarlord

@teortaxesTex if things worked this way across the board you would expect to see all of theoretical math get demolished by labs burning $10m per open problem

4h · Views 71 · Likes 1
Plastic Soldier @PlastiqSoldier

@absenteewarlord @teortaxesTex You know the AI labs have actual jobs to be doing, right? Also, when Google actually used their AI for cutting-edge science they won a Nobel Prize.

3h · Views 14 · Likes 1
Space Man @starsailing11

@teortaxesTex Definitely some sparks of RSI happening at OpenAI the past couple weeks

4h · Views 37
Eric23332 @eric23332

@teortaxesTex Even the calculator on your phone is a jagged superintelligence. Far better at long division than you.

3h · Views 20
wina @snapherantlers

@CandideIII @teortaxesTex Peter Steinberger burns 1.5M USD worth of tokens every month trying to maintain the katamari that is OpenClaw and OpenAI doesn't seem to care. So we can assume that the cost ceiling for agents aiding research is much higher.

3h · Views 4 · Likes 1
MM James @WlfMathS

@teortaxesTex we have AGI, just not unlimited tokens

3h · Views 13
Candide III @CandideIII

@teortaxesTex If it's cheaper than market then someone's footing the bill

3h · Views 10
dani @absenteewarlord

@PlastiqSoldier @teortaxesTex 10m for a major open problem is a way better use of money than 10m spent on marketing.

3h · Views 9