Biggest problem in agentic coding?
The tool call tax! (No, it's not an EU regulation.馃槄)
Did you know... - It costs you *10x* more $$$ to generate 100k tokens that include T=30 tool calls vs. T=1 tool call at the end of your 1M context window? - Those 30 tool calls cost you *9x* more $$$ at the end of a 1M window vs. the same turn at the start of your window?
馃く Because every step that ends in a remote tool call stops cloud inference and resumes later -- however useless or quick the tool is to execute locally -- so on resuming, it's a minimum 10% tax that's calculated based on the work that's been done so far. (This generously assumes you get 90% discount on cached tokens.)
The more tool calls the bigger the multiplier! The later the tool calls, the bigger the penalty!