i love claude code (and actually believe in a version of rsi), but you have to admit this is pretty funny
Alex Albert says Claude now authors more than 80% of Anthropic's merged production code
Engineers ship eight times more code than in 2024.
Positive users praise Claude writing over 80% of Anthropic's production code as impressive progress toward capable agents, while negative users call the claims hype, a pump-and-dump, or fraudulent.
Today’s edition of my newsletter just went out.
🔗 https://www.rohan-paul.com/p/anthropic-just-disclosed-that-claude
🗞️ Anthropic says 80% of its new production code is now authored by Claude
🗞️ New Google paper shows general LLMs can solve formal math by planning proofs and checking each step. Raised general LLM performance from under 10% to 70%
🗞️ Google’s new open source Gemma 4 12B can analyze audio and video while running fully locally on a consumer 16GB GPU
🗞️ Alibaba’s Qwen3.7-Plus supports text, video, and image inputs at a low price of $0.4/$1.6 per 1M tokens, though it remains proprietary.
🗞️ Anthropic’s new chemistry report has a genuinely wild result.
Anthropic just disclosed that Claude now writes more than 80% of the production code it merges.
Before Claude Code reached research preview in February 2025, Claude wrote only a low-single-digit percentage of merged code; output per engineer has since risen to 8x the 2024 baseline.
The shift comes from agents that edit files, run tests, inspect failures, spawn helper agents, and keep working across longer tasks instead of only suggesting snippets.
Anthropic says reliable task length is doubling about every 4 months, with Mythos Preview reaching at least 16 hours and open-ended Claude Code success hitting 76%.
i.e. Claude Mythos Preview could stay useful on a task that would take a skilled human roughly 16 hours of work
Claude also moved from a 3x training-code speedup to 52x, while a skilled human reached about 4x in 4 to 8 hours on the same setup.
The remaining human edge is research judgment: choosing the right problem, trusting the right result, and knowing when an experiment is dead.
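A rough sketch of what the doubling claim above implies if the trend simply continues, which nothing in the post guarantees. The 4-month doubling period and the 16-hour starting point come from the post; the function names and the pure-exponential extrapolation are assumptions made for illustration.

```python
import math

DOUBLING_MONTHS = 4.0  # from the post: task length "doubling about every 4 months"
BASELINE_HOURS = 16.0  # from the post: Mythos Preview at "at least 16 hours"


def projected_task_hours(months_from_now: float,
                         baseline_hours: float = BASELINE_HOURS,
                         doubling_months: float = DOUBLING_MONTHS) -> float:
    """Task length after `months_from_now`, assuming a steady exponential trend."""
    return baseline_hours * 2 ** (months_from_now / doubling_months)


def months_until(target_hours: float,
                 baseline_hours: float = BASELINE_HOURS,
                 doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed to reach `target_hours` under the same assumption."""
    return doubling_months * math.log2(target_hours / baseline_hours)


if __name__ == "__main__":
    for months in (0, 4, 8, 12):
        print(f"+{months:>2} months: {projected_task_hours(months):.0f} hours")
    print(f"months to reach 128 hours: {months_until(128):.1f}")
```

Under these assumptions, one year of 4-month doublings is three doublings, so 16 hours becomes 128 hours. Any slowdown or speedup in the trend changes that figure.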

https://www.anthropic.com/institute/recursive-self-improvement

Anthropic's claims about "AI Safety" are fraudulent. Their silly LLMbots will never be safe and they have no unique insight or technology to make them safe, it's all a big lie. It's literally a house of cards, and no AI researcher should support their pseudoscientific / pseudophilosophical views.

@MLStreetTalk pump and dump scheme extremely quickly. Now there is only burned ground left and no sane person wants to have anything to do with it. I feel like this is exactly what’s going to happen with the current AI bubble.

@examachine @MLStreetTalk Absolute facts

@MLStreetTalk I firmly believe that the minute Anthropic goes public the shareholders should remove him from the CEO position.

@MLStreetTalk It reminds me of the NFT craze like nothing else. Believe it or not, the technology behind it was incredible at the very beginning and could actually help distinguish real content from fake more reliably than any SynthID ever could. But the wrong people got involved and everything became a

@alex_peys RSI is a snitch

One of the big problems in the railroad bubble was the moral hazard and extra risk surface created by govt investment, often enabled by corruption and cronyism. This backstop allowed the problem to swell to a much greater level than we are seeing here.
With AI, actually, one oft-noted comforting factor is that it has been underwritten by some of the most profitable companies ever made. Govt overreach is one of the classic "bubble patterns" that hasn't surfaced yet. I hope it remains this way.
Otherwise hardworking ordinary people get saddled with all the downside risk, while much of the upside gains have already been realized in the private markets. That would be a colossal moral failing for which I hope democracy could give just political reprisal.

@MLStreetTalk The kinds of people who are prone to over-rotate on AI capabilities are those who are susceptible to flattery, inclined to animism, or scared of mortality and seeking a deus ex machina. Most of humanity is in at least one of those three groups.

@rohanpaul_ai 8x productivity jump sounds impressive until you realize we're just building a faster way to generate technical debt. It's like replacing a slow typist with a high-speed printer that occasionally prints in Wingdings.

@rohanpaul_ai the numbers are real. covered this in our latest recap:

I wish it were so easy to reduce ethics strictly to human preferences but as their own "constitution" stuff showed (well it's just robot laws) my theory (Godseed: https://arxiv.org/abs/1402.5380v2) was right and their approach was insufficient. So who's to say they'll get it right this time? Unlikely IMO because their assumptions about Ethics are wrong. They can't recover from subtle philosophical errors.

@rohanpaul_ai 80% production code by Claude, local multimodal Gemma 4, and math proofing by LLMs—the compute needed to track this space is getting wild. 🤯

@rohanpaul_ai task length doubling every 4 months will make full project ownership by agents normal soon

@rohanpaul_ai The metric I'd add: institutional memory debt rate.
8x merge velocity = 8x faster "why was this written this way?" accumulation.
Session is gone. Prod incident 6 months later: debugging code no one mentally wrote.
Write speed ↑ 52x. Understanding persistence: unchanged.

@rohanpaul_ai The biggest misconception in academia: using complex tools to cover up simple problems.

@rohanpaul_ai Means the work just moved from writing code to reviewing it. In my own agent loops I'm spending way more time on PR review than authoring. I’m okay with that; it just requires a different frame of mind to execute.

@rohanpaul_ai This growth rate is a bit scary, and the pressure on code review is getting heavier too.