AI · 6h ago

Christopher Potts, Bigspin AI co-founder, says Anthropic's Opus 4.6 suffered "tokenflation" by requiring 4.38 times more tokens for the same work

Code survival rates rose from 90% to 95%.

Original post
Christopher Potts @ChrisGPotts · #225 in AI

See my blog post with @mmooritz for additional details on these analyses: https://bigspin.ai/resources/the-decline-of-token-level-purchasing-power

Christopher Potts @ChrisGPotts

Does a token buy you more or less now than it did a few months ago? We built a consumer price index (CPI) for AI coding output from Anthropic's Opus 4.6 model in SWE-chat, Feb 5–Apr 15, 2026. What we find looks like tokenflation:

9:24 AM · Jun 8, 2026 · 1.2K Views
Sentiment

Most commenters praised the tokenflation study of Anthropic's Opus coding model, with one calling the idea of a token price index "genius"; others dismissed the analysis as overlooking gains in model capability.

Positive: 66.7% · Negative: 33.3% (7 comments with sentiment)
Cluster Engagement
Posts from X
Most Activity
Views 18.1K · Bookmarks 114 · Likes 235 · Retweets 31 · Replies 12
Christopher Potts @ChrisGPotts

Does a token buy you more or less now than it did a few months ago? We built a consumer price index (CPI) for AI coding output from Anthropic's Opus 4.6 model in SWE-chat, Feb 5–Apr 15, 2026. What we find looks like tokenflation:

6h · Views 18.1K · Likes 235 · Bookmarks 114
Christopher Potts @ChrisGPotts

The goods in our CPI basket are standard engineering outcomes like PRs, but we also included knowledge capture (any time work is stored somewhere durable), which shows the most positive trend and which resonates with us as heavy users of these tools.

Christopher Potts @ChrisGPotts

See my blog post with @mmooritz for additional details on these analyses: https://bigspin.ai/resources/the-decline-of-token-level-purchasing-power

6h · Views 997 · Likes 9 · Bookmarks 1

Nominating “Tokenflation” for early-decision word-of-the-year for 2026.

6h · Views 787 · Likes 6 · Bookmarks 1
Kawin Ethayarajh @ethayarajh

I love the notion of 'token-level purchasing power'.

Still, it's really hard to create a stable 'basket of goods' when the frontier of model abilities is changing so quickly. Much like there is no automotive analog of a Tesla FSD from the 90s, a lot of what coding agents can do now could simply not have been done even a few months ago!

My vibes-based take is that I'm getting more value from these tools over time. Whether that's happening on a token-amortized basis is unclear, because more and more tokens are being spent on hidden reasoning.

5h · Views 458 · Likes 1 · Bookmarks 1
Christopher Potts @ChrisGPotts

Our CPI includes a hedonic adjustment based on the code survival rate. This goes from about 90% to 95% in our time period, so we incorporate a uniform 5% adjustment. This softens the tokenflation; code output seems to be improving!
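One way to read the hedonic adjustment described above is as a multiplier on the raw change in output per token. The thread does not say how the 5% enters the index, so the sketch below is an assumption, and the names `hedonic_adjust` and `relative_survival_gain` are hypothetical. It also shows an alternative reading in which the relative improvement in survival (95% over 90%) is used instead of the absolute 5 points.

```python
# Hypothetical sketch: the thread does not specify how the adjustment is applied.
SURVIVAL_START = 0.90  # approximate code survival rate at the start of the period
SURVIVAL_END = 0.95    # approximate code survival rate at the end of the period
UNIFORM_ADJUSTMENT = SURVIVAL_END - SURVIVAL_START  # the "uniform 5%" figure


def hedonic_adjust(raw_ratio: float, adjustment: float = UNIFORM_ADJUSTMENT) -> float:
    """Scale a raw output-per-token ratio (end / start) upward for quality gains.

    Assumption: the adjustment is multiplicative, i.e. raw_ratio * (1 + adjustment).
    """
    return raw_ratio * (1.0 + adjustment)


def relative_survival_gain(start: float = SURVIVAL_START, end: float = SURVIVAL_END) -> float:
    """Alternative reading: relative rather than absolute improvement in survival."""
    return end / start - 1.0
```

Under the multiplicative assumption, a raw ratio of 0.50 (each token buys half as much output as before) becomes about 0.525 after adjustment. The relative reading gives a slightly larger adjustment of about 5.6%.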

Christopher Potts @ChrisGPotts

The goods in our CPI basket are standard engineering outcomes like PRs, but we also included knowledge capture (any time work is stored somewhere durable), which shows the most positive trend and which resonates with us as heavy users of these tools.

6h · Views 773 · Likes 4 · Bookmarks 0
Christopher Potts @ChrisGPotts

To give you a concrete sense for what the above declines mean, here are the estimated outcomes for a 10K-token session:

Feb 2026: a third of a PR, 630 lines of code drafted, 8 files touched, 1 knowledge capture.

Apr 2026: a sixth of a PR, 91 lines of code drafted, 4 files touched, less than half a knowledge capture.

6h · Views 52 · Likes 3
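The per-session estimates in the post above can be turned into tokens per unit of output and a per-good cost multiplier. This is a rough arithmetic sketch, not the authors' index: the function names are hypothetical, knowledge capture is left out because the April figure is only given as a bound ("less than half"), and the per-good ratios are not combined into a single CPI value because the thread does not state the basket weights.

```python
SESSION_TOKENS = 10_000

# Estimated outcomes per 10K-token session, as quoted in the thread.
FEB_2026 = {"prs": 1 / 3, "lines_drafted": 630, "files_touched": 8}
APR_2026 = {"prs": 1 / 6, "lines_drafted": 91, "files_touched": 4}


def tokens_per_unit(outcomes, session_tokens=SESSION_TOKENS):
    """Tokens needed to obtain one unit of each good at the given session yield."""
    return {good: session_tokens / quantity for good, quantity in outcomes.items()}


def cost_multipliers(before, after):
    """How many times more tokens one unit of each good costs in `after` vs `before`."""
    return {good: before[good] / after[good] for good in before}


if __name__ == "__main__":
    for good, factor in cost_multipliers(FEB_2026, APR_2026).items():
        print(f"{good}: {factor:.2f}x tokens per unit")
```

On these figures a PR goes from roughly 30,000 to 60,000 tokens (2x), files touched also cost 2x, and lines drafted cost about 6.9x as many tokens.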
Christopher Potts @ChrisGPotts

What is causing the decline in token-level purchasing power? One relevant thing we know is that the job of an Opus 4.6 token changed dramatically in this period, from purely writing code in February to coding, thinking, and explaining in mid-April:

6h · Views 41 · Likes 3
Christopher Potts @ChrisGPotts

We would love to expand our analysis to other models, providers, and time periods. We've created a Claude Code tool/plugin that you can use to analyze your own usage. The tool shares no data with us, but perhaps we could all pool our anonymized data to learn more about the true value of our tokens: https://github.com/bigspinai/plugins

6h · Views 115 · Likes 2
Christopher Potts @ChrisGPotts

We ourselves derive significant value from these tools and wonder whether measuring other relevant outcomes would change the CPI. That said, it seems clear that AI is becoming more expensive in ways that might be outpacing its current value.

6h · Views 41 · Likes 2
Kawin Ethayarajh @ethayarajh

I love the notion of 'token-level purchasing power'.

That said, it is really hard to create a stable 'basket of goods' when the frontier of model abilities is changing so quickly. Much like there is no automotive analog of a Tesla FSD from the 90s, a lot of what coding agents can do now could simply not have been done even a few months ago!

My vibes-based take is that I'm getting more value from these tools over time. Whether that's happening on a token-amortized basis is unclear, because more and more tokens are being spent on hidden reasoning.

5h · Views 43 · Likes 1 · Bookmarks 0
Christopher Potts @ChrisGPotts

Another link to the associated blog post: https://bigspin.ai/resources/the-decline-of-token-level-purchasing-power

6h · Views 89 · Likes 3
Scratch @scratchdotmd

@ChrisGPotts a token price index (TPI) is genius, thanks for doing this

43m · Views 6 · Likes 1
Scratch @scratchdotmd

@ChrisGPotts if you want another opinion on things to include in the basket of goods, happy to help

40m · Views 4
ueaj @_ueaj

@ChrisGPotts A 90->95% survival rate should do more than increase PPP by 5%, the last 10% of PRs are 10x more valuable. You can't really do an inflation curve without actual price signals

5h · Views 89 · Likes 1
efe @extliqprovider

@ChrisGPotts this is an amazing project btw

5h · Views 51
FinanceAI Labs @financeailabs

Tokenflation is the most honest thing to come out of AI research in months. You are paying more tokens for less output as model providers optimize for engagement over quality. The subscription fees go up, the CLAUDE.md files get longer, and the actual productivity gains remain a study in a deck somewhere.

2h · Views 10

@ChrisGPotts This is like updating from Windows 95 to Vista and complaining that everything takes longer while everyone else updated their hardware (model) and now has a 2x more capable system. You simply didn't update the hardware, the OS works.

1h · Views 9

@ChrisGPotts Tokenflation is the right frame. The bill only looks cheap if each token still buys useful engineering work.

2h · Views 7
Guilherme O'Tina @guilhermeotina

@ChrisGPotts the framing that might matter more: what's the total cost to reach a fixed quality bar, not cost per line of output. if the first draft gets worse but the fix loop is faster, the CPI looks bad while the real outcome improves. time-to-acceptable-PR feels like the right unit

6h · Views 7
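The point in the reply above, that total cost to reach a fixed quality bar can fall even while the first draft gets more expensive, can be illustrated with a small model. Everything here is hypothetical: the `Workflow` class and all of the token counts are made up for illustration and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical cost model: one first draft plus a number of fix rounds."""

    draft_tokens: int
    fix_rounds: int
    tokens_per_fix_round: int

    def total_tokens(self) -> int:
        """Tokens spent to reach an acceptable PR: draft plus all fix rounds."""
        return self.draft_tokens + self.fix_rounds * self.tokens_per_fix_round


# Illustrative numbers only; none of these are measurements from the thread.
older = Workflow(draft_tokens=10_000, fix_rounds=5, tokens_per_fix_round=6_000)
newer = Workflow(draft_tokens=20_000, fix_rounds=2, tokens_per_fix_round=4_000)

if __name__ == "__main__":
    print("older total:", older.total_tokens())
    print("newer total:", newer.total_tokens())
```

With these made-up inputs the first draft costs twice as many tokens in the newer workflow, yet the total to reach an acceptable PR drops from 40,000 to 28,000 tokens. A per-draft index would register that as inflation while a time-to-acceptable-PR measure would not.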