AI · 18h ago

Silicon Data's LLM Token Expenditure Index stagnates at 1.99, signaling a slowdown in enterprise migration to premium closed models

Weisenthal argues cheap closed models handle standard developer workloads.

Original post

Silicon Data@Silicon_Data

Our LLM Token Expenditure Index should really have been named the “Token Expenditure Price Index” bc it’s an expenditure- or usage-weighted average token price index. It tells you how much the entire AI market is currently paying for a million LLM tokens, irrespective of model.

The naming might’ve led to some misinterpretations, as some seem to have read the index as either the total volume of tokens used or the average price of tokens. In reality, the index captures something more subtle than either interpretation: it tells us the marginal willingness to pay for LLM models.

Over the course of the year, while model token prices haven’t moved that much, usage patterns have shifted dramatically. The token index fell and then rose sharply as AI users moved en masse to cheap open-weight models and then en masse to the much more expensive frontier closed-source models. From consumers to enterprises, everyone is Claude-maxxing!

More recently, as can be seen in the chart below, the token index has stagnated, which suggests that usage migration towards frontier models has slowed. Time will tell whether this is just a pause or an inflection in the trend as users move back towards open-weight models. In a sense, our token index could be roughly interpreted as a “quality premium” of frontier models over the much cheaper open-source models (if we assume users and prices are both “rational”).

For more details on what we offer beyond the few indices we’ve listed on the Bloomberg Terminal, check us out at http://www.silicondata.com and give us a holler! 😊

3:36 PM · Jun 5, 2026 · 84.5K Views
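To make the post's description concrete, here is a minimal sketch of a usage-weighted average token price. It assumes the index is total spend divided by total tokens, which is one reading of "usage-weighted"; Silicon Data's actual methodology is not given in the post. The model names, prices, and volumes are hypothetical. The two periods use identical prices, so the index moves only because the usage mix shifts toward the expensive model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelUsage:
    name: str                 # hypothetical model label
    price_per_million: float  # USD per 1M tokens (hypothetical)
    tokens_millions: float    # tokens consumed, in millions (hypothetical)


def usage_weighted_price(usage):
    """Average USD paid per 1M tokens across all models: total spend / total tokens."""
    total_tokens = sum(m.tokens_millions for m in usage)
    if total_tokens <= 0:
        raise ValueError("total token volume must be positive")
    total_spend = sum(m.price_per_million * m.tokens_millions for m in usage)
    return total_spend / total_tokens


# Same prices in both periods; only the usage mix changes.
early = [
    ModelUsage("open-weight", 0.50, 80.0),
    ModelUsage("frontier-closed", 10.00, 20.0),
]
late = [
    ModelUsage("open-weight", 0.50, 50.0),
    ModelUsage("frontier-closed", 10.00, 50.0),
]

print(f"{usage_weighted_price(early):.2f}")  # 2.40
print(f"{usage_weighted_price(late):.2f}")   # 5.25
```

Under these assumed numbers the index more than doubles with no price change at all, which is the kind of mix-driven movement the post describes.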

Sentiment

Positive commenters accept the stalled migration to frontier models as sensible, since most real work runs on steadily improving middle-tier models; negative commenters dismiss the analysis as biased or inefficient.

Pos 75.0% · Neg 25.0% · 8 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 7.2K · Bookmarks 1 · Likes 17 · Replies 3

Joe Weisenthal@TheStalwart

I think some people (obviously not the folks at @Silicon_Data) have some mental model of open models as being freely usable open source software like Linux. But they still need costly chips and electricity to use. (Though SOTA open models are clearly cheaper on Chinese clouds)

14h

Feng Tao Ning@ftning

@TheStalwart @Silicon_Data why wouldn’t you just dynamically select along the pareto frontier by your intelligence budget? it’s just the efficient frontier for CAPM dorks

14h
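The selection rule suggested in this reply can be sketched roughly as follows. This is an illustration only: it reads "intelligence budget" as a maximum price per million tokens, and the model names, costs, and quality scores are hypothetical. A model is on the Pareto frontier if no other model is at least as cheap and at least as good while being strictly better on one of the two.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    name: str       # hypothetical model label
    cost: float     # USD per 1M tokens (hypothetical)
    quality: float  # benchmark-style score, higher is better (hypothetical)


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Return non-dominated models, sorted by cost with strictly rising quality."""
    frontier: List[Model] = []
    best_quality = float("-inf")
    # Cheapest first; on equal cost, the higher-quality model comes first.
    for m in sorted(models, key=lambda m: (m.cost, -m.quality)):
        if m.quality > best_quality:
            frontier.append(m)
            best_quality = m.quality
    return frontier


def pick_model(models: List[Model], budget: float) -> Optional[Model]:
    """Pick the highest-quality frontier model whose cost fits the budget."""
    affordable = [m for m in pareto_frontier(models) if m.cost <= budget]
    # The frontier is ordered by rising cost and quality, so the last one is best.
    return affordable[-1] if affordable else None


catalog = [
    Model("small-open", 0.30, 60.0),
    Model("mid-closed", 1.00, 75.0),
    Model("pricey-mid", 2.00, 70.0),  # dominated by mid-closed
    Model("frontier", 15.00, 90.0),
]

print([m.name for m in pareto_frontier(catalog)])
# ['small-open', 'mid-closed', 'frontier']
print(pick_model(catalog, budget=5.0).name)  # mid-closed
print(pick_model(catalog, budget=0.10))      # None
```

In practice a single quality score is a strong simplification, since model quality varies by task; the sketch only shows the mechanics of choosing along a cost/quality frontier.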

Joe Weisenthal@TheStalwart

@poiThePoi @Silicon_Data There are plenty of lagging-edge models that are orders of magnitude cheaper than the SOTA ones, most notably the Gemini Flash family.

14h

Linh Dao@LinhDaoFintech

@TheStalwart @Silicon_Data the framing always skips the point that most production workflows don't need SOTA, they need reliable and cheap — that's the real market.

14h

Poi@poiThePoi

@TheStalwart @Silicon_Data The labs keep turning off the old models.

14h

Auyon Siddiq@auyonomous

@TheStalwart @Silicon_Data The semantics around the "frontier" are interesting. In other contexts we accept tech supremacy is secondary to organizational capacity (e.g. military). With AI there seems to be FOMO about the frontier. May settle down once (if?) the frontier stops expanding.

14h

Dillon Amburgey@thedillona

@TheStalwart @Silicon_Data I think there’s also a lot of ideology and motivated reasoning in the discussion

14h

Tomer Stern@tomer_stern

@TheStalwart @Silicon_Data Yeah, I imagine for most enterprise account users of Claude, Sonnet is way more popular than Opus.

The people running Opus agents 24/7 are probably mostly on personal Claude Max plans or are hyper-privileged users with a $5,000+ a month spend limit.

14h

Anthony Bardaro@AnthPB

@TheStalwart @Silicon_Data Yea, that…

14h

AJ@chiwizardry

@TheStalwart @Silicon_Data You live in New York and aren't watching the Knicks??

14h

Joe Weisenthal@TheStalwart

@thedillona @Silicon_Data Say more

13h

Tom@BobbleheadBrett

@TheStalwart @Silicon_Data Not now, Joe, the Knicks are playing

14h

Dwee Bwae@DweeBwae

@TheStalwart @Silicon_Data It's definitely possible. The reason DeepSeek is open is arguably to ram home that undercutting is inevitable and this will be a largely commodity game, but the openness won't be exploited by most people. Except maybe as a cheap dev service or something.

14h

Michael Booth@_MichaelBooth_

@TheStalwart @Silicon_Data I think it is definitely true that there is a large overlap between people heavy into AI usage and people who had a decently powerful computer before AI took off for other reasons. So they over-index on how much compute the avg person has access to locally (I have done this)

14h

murdarch@murd_arch

@TheStalwart @poiThePoi @Silicon_Data And they're really good. Claude: figure out what to do. Flash: do it over and over again.

14h

Auyon Siddiq@auyonomous

@TheStalwart @Silicon_Data A lot of this is inherited from the model benchmarking norms in computer science. It's funny to think about how that narrow technical question about SOTA has overtaken AI discourse, all the way up to questions about national security and state capacity

14h

Tomer Stern@tomer_stern

@TheStalwart @Silicon_Data If you get $100 of API spend a month and have a coding-based job, you simply are not using the top-tier model for most things

14h

E. Morales@ElianWorldView

@TheStalwart @Silicon_Data This. Non-SOTA closed models are the missing middle. Most actual deployment lives there.

14h

Bitcoin Board@btc_board

@TheStalwart @Silicon_Data most real work happens in the boring middle. not every task needs a lambo brain

14h

Otto Zampulu@zampulu92728

@TheStalwart @Silicon_Data Which models?

14h