Until yesterday, our entire MTS team operated under the philosophy of tokenmaxxing as much as possible on Claude Max plans.
With Fable, this may no longer be possible:
- One of our team members hit his limit 3 times yesterday and used the equivalent of $1.5k in 10 hours
- Half of our team has hit quota limits on eng work
This era of tokenmaxxing may need to be reined in, or at least given clear guardrails. We are concerned about running Fable on API-based billing. If every engineer starts spending tokens at levels equivalent to headcount costs, our burn rate will meaningfully increase.
Just as startups are starting to bake model routing into their core product, we will have to start thinking about model routing in our core engineering usage.
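
As a rough illustration of what routing with guardrails could look like for our own eng usage, here is a minimal sketch. Everything in it is an assumption for illustration: the `Budget` class, the `route_model` function, the "frontier" and "economy" tier labels, the complexity labels, and the dollar figures are all hypothetical, not an existing tool or a real pricing model.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical per-engineer daily spend cap, in USD."""
    daily_limit_usd: float
    spent_usd: float = 0.0

    @property
    def remaining_usd(self) -> float:
        # Never report a negative remaining budget.
        return max(self.daily_limit_usd - self.spent_usd, 0.0)

    def record(self, cost_usd: float) -> None:
        # Track spend after a task completes.
        self.spent_usd += cost_usd


def route_model(task_complexity: str, est_cost_usd: float, budget: Budget) -> str:
    """Pick a model tier for one task.

    Assumed policy: only tasks labeled "high" complexity go to the
    expensive tier, and only while the estimated cost still fits in
    the remaining daily budget. Everything else uses the cheaper tier.
    """
    if task_complexity == "high" and est_cost_usd <= budget.remaining_usd:
        return "frontier"
    return "economy"


# Example usage with made-up numbers.
budget = Budget(daily_limit_usd=100.0)
print(route_model("high", 20.0, budget))  # frontier
budget.record(90.0)                       # most of the day's budget is gone
print(route_model("high", 20.0, budget))  # economy
print(route_model("low", 1.0, budget))    # economy
```

The point of the sketch is that the guardrail lives in one place: the cap and the routing rule can be tuned without asking each engineer to self-police their usage.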















