Tech · 16h ago

Anthropic Claims 8x Engineer Productivity As API Leaks Customer Data


Original post

Joseph Suarez 🐡@jsuarez

This post right here, officer. Let me know when your engineers ship 8x LESS code

Anthropic@AnthropicAI

Today, Anthropic engineers on average ship 8x as much code per quarter as they did compared to 2021-2025.

4:38 PM · Jun 4, 2026 · 40.2K Views

Sentiment

Some users express optimism about AI accelerating progress from Anthropic's 8x code claim, while many dismiss the metric itself as arbitrary marketing or flawed due to review costs and meaningless numbers.

Positive: 33.3%

Negative: 66.7%

8 comments with sentiment.


Posts from X


wj@woo0057

@AnthropicAI and yet you don't know why your models are telling people to go to sleep after some tasks. "could revolutionize society". Get a better look at society first before you start claiming you're going to help society. People out here are celebrating a murder of a ceo.


Jake@JakeKAllDay

@jsuarez This is why we have a leaderboard that we review at the end of our sprint: who shipped the most lines of code (it's usually our SWE II on a PIP)



Gen Z Mind@gen_z_mind

This may help explain why Claude Code sometimes makes unnecessary tool calls, and why users see errors such as, ‘Your message was sent, but Claude couldn’t respond. Try again.’ I would strongly encourage the team to focus more effort on testing and validating feature behavior, reliability, and performance before release. Recent changes appear to have reduced existing capabilities or introduced regressions that cause users to waste tokens retrying failed interactions. More validation and testing would help preserve user trust and improve the overall developer experience. Thank you! 🙏


meowbooks@meowbooksj

@jsuarez @jeremyphoward yep


Dan Advantage@DanAdvantage

@jsuarez yeah. obviously a pointless, useless measure-turned-marketing-tool *just like tokens spent* how about tokens spent per lines of code shipped while we're at it


Yisheng Jiang@YishengJiang

@jsuarez Sometimes shipping code is good


John@jrysana

@jsuarez Let me know when they delete 8x more code


Saarth Shah@saarth_

@AnthropicAI What is the obsession with lines of code?


Daniel Brooks@rackSpreader1

@dsmproengineer @jsuarez found the vibe coder 😂 ^


Tony@shishanyu

@cgarciae88 …and got so far …


Mo@mosyaseen

@AnthropicAI measuring engineer output in lines of code in 2026. might as well pay the agent by the word. it'll write you a novel to rename a variable and bill you for the trilogy.


DSM Pro Engineering@dsmproengineer

@jsuarez people that think less code is better don't know how computers work.


AI Subscription Deals@CheapAIToken

@jsuarez 8x more code is not the flex if review cost also scales.


Ange-Emmanuel Kouakou@arelaxedscholar

@AnthropicAI They ship 8x more code because they ship, then ship a fix, then a fix for the fix, ...

Code production is a stupid metric. I thought we had already gone over this as an industry.


Ali Shaheen@AliShaheen

Here is my personal graph - same shape as Anthropic's curve. Same inflection point.

Here's what moved the needle - and none of it is exotic:

→ Layered knowledge. A CLAUDE.md system the agent reads before every action. Discipline, not prompting.

→ Memory that survives. Context that holds within a session and carries across them - no starting from zero every morning. Memory management is what makes or breaks any AI system.

→ A pipeline with zero manual steps. Ticket → tests-first → code → review → PR. Same path every time.

→ Parallel agents, each in its own isolated container. Many tickets at once, no merge chaos.

For me, the next step came from a constraint: unlike Anthropic and OpenAI, I could not spend millions on tokens. I didn't want to compromise on quality, but I wanted to cut token spend and ensure best practices could be applied consistently across the team.

So once the memory and context layers were solid, I could route intelligently: local models (Qwen), Kimi, DeepSeek and Gemini for the cheap work, Claude only for the hard stuff. The system learns which is which. More output, lower spend. This has been hard work, with lots of experiments and fine-tuning (more on it later), but it's well worth it.

It's all open source for anyone to use or adapt: http://github.com/alinaqi/maggy

And I've now built a hosted version - http://www.srooter.ai - to put a real governance layer on top: enforce best practices across a team and push engineering performance without growing token spend, all from inside Claude Code, Codex and the tools you already use. #AIEngineering
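Editor's sketch: the routing idea in the post above (cheap models for easy work, Claude only for hard work) can be illustrated roughly as follows. This is not taken from the linked repository. The `Task` fields, the difficulty heuristic, the threshold, and the model labels are all hypothetical, and the post says its system learns the split, whereas this toy version uses a fixed rule.

```python
from dataclasses import dataclass

# Hypothetical labels. The post names Qwen, Kimi, DeepSeek and Gemini as the
# cheap options and Claude for hard work; these strings are placeholders only.
CHEAP_MODEL = "local-qwen"
STRONG_MODEL = "claude"


@dataclass
class Task:
    description: str
    files_touched: int   # assumed proxy for the size of the change
    needs_design: bool   # assumed flag for work that needs architectural judgment


def estimate_difficulty(task: Task) -> float:
    """Toy score in [0, 1]. A fixed heuristic, not a learned router."""
    score = min(task.files_touched / 10, 1.0) * 0.5
    if task.needs_design:
        score += 0.5
    return score


def route(task: Task, threshold: float = 0.5) -> str:
    """Send tasks at or above the threshold to the strong model."""
    if estimate_difficulty(task) >= threshold:
        return STRONG_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    print(route(Task("rename a variable", files_touched=1, needs_design=False)))
    print(route(Task("redesign auth flow", files_touched=8, needs_design=True)))
```

A learned router would replace `estimate_difficulty` with a score fitted to past outcomes, for example whether the cheap model's output passed review.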


🧭Postcards From The Wild West@_TenMoreYears

@AnthropicAI Sounds like your engineers haven't figured out Ship velocity =/= deployment, adoption velocity, or monetization velocity.

And nobody is asking for 8x code. Look around and smell the coffee.

Welcome to the real world.


Abror Aliboyev@aliboyev_com

Is the number of bugs proportional to the size of the codebase, or is it disproportionately higher? On average, how many times is each type of bug fixed? What percentage of the total lines of code (LOC) is dead code, and does this LOC count include only executable code or also blank lines and comments?
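Editor's sketch: the last question above matters because a LOC figure changes depending on what is counted. A minimal illustration, assuming a language with `#` line comments and ignoring inline comments, block comments, and docstrings; the function name `count_lines` is hypothetical and has no connection to how Anthropic measured its figure.

```python
def count_lines(source: str, comment_prefix: str = "#") -> dict:
    """Split a source string into code, blank, and comment-only line counts."""
    counts = {"code": 0, "blank": 0, "comment": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(comment_prefix):
            counts["comment"] += 1
        else:
            # Lines with trailing inline comments still count as code here.
            counts["code"] += 1
    return counts


if __name__ == "__main__":
    sample = "# header\n\nx = 1\ny = 2  # inline\n"
    print(count_lines(sample))
```

On the sample, a raw line count gives 4 while the executable-only count gives 2, so the same change can look twice as large under one definition as under the other.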


あいり｜海外AIニュースを毎日届ける人@airiaiai8

@AnthropicAI I wrote an explanation in Japanese. Japanese breakdown here:


itchy@itchy_est

@DanAdvantage @jsuarez
