If big companies can't make a net return on their LLM token costs, that doesn't mean it's impossible to. In fact this is exactly what you'd expect to happen with a new technology. Incumbents can't use it well, and are replaced by upstarts who can.
Garry Tan blamed corporate skill gaps for the challenges
Positive users praise startups cutting LLM token costs and outpacing big tech via better incentives and productivity gains, while negative users call the claims dubious or dismiss AI as worthless.
Curiously enough I did office hours today with a startup that cuts companies' LLM token costs by optimizing requests. They can cut costs by about half, which they split with the customer. So the TAM is a quarter of the model companies' corporate revenue. That's a big TAM!
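The arithmetic behind that TAM estimate can be checked with a rough sketch. It assumes the savings are split evenly between the startup and the customer, which the post implies but does not state outright. The function name and parameters are hypothetical, chosen only for illustration.

```python
def optimizer_revenue_share(corporate_llm_spend, cost_cut=0.5, vendor_split=0.5):
    """Revenue the optimizing startup keeps from a given LLM spend.

    cost_cut: fraction of spend eliminated (about half, per the post).
    vendor_split: fraction of the savings the startup keeps
                  (assumed to be an even split; not stated in the post).
    """
    savings = corporate_llm_spend * cost_cut
    return savings * vendor_split


spend = 100.0  # arbitrary units of corporate LLM spend
revenue = optimizer_revenue_share(spend)
print(revenue / spend)  # 0.25, i.e. a quarter of corporate spend
```

Under those assumptions, half of the spend is saved and half of that saving goes to the startup, which is where the "quarter" comes from.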
Skill issues at big companies mean small new ones can eat their lunch

@paulg Or just do this :)

@paulg I also think the endgame of LLM routing looks like a mixture of 80% fine-tuned local models / 20% frontier lab models

@paulg "The very processes and values that constitute an organization’s capabilities in one context, define its disabilities in another context."
-Clayton Christensen

@paulg We are building this with an open source core engine @modelmeld. Approximating TAM as splitting savings with customers is the wrong way to look at it IMO, because it is very hard to validate what costs "would have been". https://github.com/modelmeld/modelmeld

@rickasaurus Their valuations are bets on the probability of this outcome.

@paulg Adapt or die. A tale as old as time.

Um, yeah. I’m not sharing details on my defense. I have haters. When I became vocal about the comparison between how some companies code and how sophisticated hackers exploit, it opened me up to revenge attempts.
I only ever messed with blackhats in places like CryptBB, and they get extra pissed when you fry their system or expose how fragile their setup really is. These Kali kids aren’t used to systems-level exploitation. They’ll go all the way around their ass just to get into Google. They’re tool users, not systems thinkers. That’s the issue with tech now. One generation had to learn things the hard way, which was better for developing real understanding. You had to break things, trace things, rebuild things, and actually understand the system. Then the next generation grows up inside polished products and prebuilt tools. They become productized employees: trained to operate the interface, not to understand the machine underneath it. That’s how craft gets replaced by workflow.
I haven’t hacked in years, except CoD cheaters. Little bastards. Change of subject.
Tell me more about the quantum lab at Vandy. I’m not far.

@paulg The false premise is token costs won't approach software costs.
For some reason, every time you ask ChatGPT to solve a Rubik's Cube, it regenerates the same code over 8 minutes. Everyone is very wasteful right now.
We invented a new primitive that reduces this cost to 0.

@paulg The fatal mistake is expecting cost reductions in IT. That line item is only going to get bigger as a % of net sales. All token ROI needs to be in COGS and classically stubborn operating lines, such as legal and leases.

People should be assigned a threshold of tokens, then they will start making better decisions.
It's easy to get lazy with an LLM and ask it silly things like "make a screenshot of all views of the app, open them and let me know what you think" instead of just looking at the app yourself.
The best employees will be the ones that bring better results with less token usage. It should definitely be a metric.
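A per-person token threshold like the one suggested here could look roughly like the sketch below. The class name, the limit, and the idea of refusing a request that would exceed the budget are all assumptions made for illustration; a real setup would take token counts from the provider's usage data.

```python
class TokenBudget:
    """Tracks token usage per user against a fixed threshold (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit
        self.used = {}  # user -> tokens spent so far

    def charge(self, user, tokens):
        """Record usage; return False if it would exceed the user's budget."""
        spent = self.used.get(user, 0)
        if spent + tokens > self.limit:
            return False
        self.used[user] = spent + tokens
        return True

    def remaining(self, user):
        return self.limit - self.used.get(user, 0)


budget = TokenBudget(limit=1000)
budget.charge("alice", 600)         # accepted
budget.charge("alice", 500)         # rejected, would exceed 1000
print(budget.remaining("alice"))    # 400
```

The same counters could double as the "results per token" metric mentioned above, if they were paired with some measure of output.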

Hi 👋 Paul, you can save up to 99% of tokens with the project below ⬇️
As someone who builds AI agents every day, token usage quickly turned into a major bottleneck for me.
It’s now saving me ~88 million tokens per day (and climbing toward 2B+ monthly).
I’ve fully open-sourced it under MIT so everyone can benefit
Would love your feedback if you give it a try! 🙏

@paulg That is brilliant!

@paulg Half the token bill sounds nice until you realize the real win is getting people to trust a third party with their prompts. Splitting the savings is clever, but the margin’s razor thin.

@paulg Has any dominant incumbent in one era managed to retain dominance in another?

@XTeamPal how does it work, bro? Can I use it in my Claude Code or other agents?

@paulg The 50% cut is real, but the bigger lever is upstream. Most enterprise LLM spend goes to requests that should never reach the model. Fix the routing, add caching, decompose the tasks, and costs drop before you optimise a single token.
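The "fix the routing, add caching" idea can be sketched as below. This is a minimal illustration, not any vendor's implementation: the class name, the `is_hard` heuristic, and the stub models are all hypothetical, and the cache only matches prompts that are identical after whitespace normalisation.

```python
import hashlib


class CachedRouter:
    """Answers from cache when possible, otherwise picks a cheap or frontier model."""

    def __init__(self, cheap_model, frontier_model, is_hard):
        self.cheap_model = cheap_model        # callable: prompt -> answer
        self.frontier_model = frontier_model  # callable: prompt -> answer
        self.is_hard = is_hard                # callable: prompt -> bool (assumed heuristic)
        self.cache = {}
        self.calls = {"cache": 0, "cheap": 0, "frontier": 0}

    def complete(self, prompt):
        # Normalise whitespace so trivially different prompts share a cache key.
        normalised = " ".join(prompt.split())
        key = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if key in self.cache:
            self.calls["cache"] += 1
            return self.cache[key]
        tier = "frontier" if self.is_hard(prompt) else "cheap"
        model = self.frontier_model if tier == "frontier" else self.cheap_model
        answer = model(prompt)
        self.calls[tier] += 1
        self.cache[key] = answer
        return answer


# Stub models stand in for real API calls.
router = CachedRouter(
    cheap_model=lambda p: "cheap: " + p,
    frontier_model=lambda p: "frontier: " + p,
    is_hard=lambda p: len(p.split()) > 50,  # crude length-based heuristic
)
router.complete("What is our refund policy?")
router.complete("  What is our   refund policy? ")  # served from cache
router.complete("word " * 60)                       # long prompt goes to frontier
print(router.calls)  # {'cache': 1, 'cheap': 1, 'frontier': 1}
```

In this toy run only one of the three requests reaches the expensive model, which is the upstream lever the reply describes.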