5h ago

SemiAnalysis posted data from 174,264 agentic coding sessions showing 42% of runtime spent on CPU tasks versus 58% on GPU inference, and highlighted the mismatch between per-core cloud pricing and per-token monetization.

Median per-turn time measured 5.13 seconds.
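
As a rough illustration of what the reported split would mean for a single turn, the sketch below applies the 42%/58% shares to the 5.13-second median. Applying an aggregate runtime share to the median turn is an assumption made only for illustration; the post does not give a per-turn breakdown, and the names `split_turn`, `CPU_SHARE`, and `MEDIAN_TURN_S` are hypothetical.

```python
# Illustrative only: applies the aggregate 42%/58% runtime split to the
# median per-turn time. The source does not report a per-turn breakdown.
CPU_SHARE = 0.42       # share of runtime on CPU tool use (from the post)
MEDIAN_TURN_S = 5.13   # median per-turn time in seconds (from the post)


def split_turn(turn_s: float, cpu_share: float = CPU_SHARE) -> tuple[float, float]:
    """Return (cpu_seconds, gpu_seconds) for one turn under a fixed CPU share."""
    cpu_s = turn_s * cpu_share
    return cpu_s, turn_s - cpu_s


cpu_s, gpu_s = split_turn(MEDIAN_TURN_S)
print(f"CPU: {cpu_s:.2f} s, GPU: {gpu_s:.2f} s")  # CPU: 2.15 s, GPU: 2.98 s
```

Under that assumption, roughly two of every five seconds in a median turn would be spent waiting on tools rather than generating tokens.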

Original post

FACT ALERT 🚨: In modern agentic coding, 42% of the time is spent on the CPU doing tool use such as editing files, running Bash scripts, running linters, etc. Traditional cloud computing charges in $ per CPU core. In the agent economy, the business model is $ per token, so to increase token revenue you need to increase the amount of CPU power you have so that you can generate your tokens.

7:00 AM · May 23, 2026

Very important.

SemiAnalysis @SemiAnalysis_

2:00 PM · May 23, 2026 · 73K Views
3:03 PM · May 23, 2026 · 13K Views

Here is one big reason why this matters. Time spent on non-LLM inference tasks is only going to increase. However, the tools that these AI systems use are *very* inefficient and have been built from the ground up for CPU and human use. There is a huge untapped opportunity to significantly improve those processes by designing them with AI agents in mind from the start.

7:32 PM · May 23, 2026 · 881 Views

Here is one big reason why this matters. Non-LLM inference time is only going to increase. However, the tools that these AI systems use are *very* inefficient and have been built from the ground up for CPU and human use. There is a huge untapped opportunity to significantly improve those processes by designing them with AI agents in mind from the start.

7:11 PM · May 23, 2026 · 1.5K Views
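
The replies above argue that faster, agent-oriented tools are an untapped opportunity. One hedged way to size that claim is Amdahl's law applied to the 42% CPU share: if only the tool-use portion gets faster, the overall turn speeds up by a bounded amount. The sketch assumes GPU inference time is unchanged and that CPU and GPU work do not overlap, neither of which the posts confirm; `turn_speedup` is a hypothetical helper.

```python
def turn_speedup(cpu_share: float, tool_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the CPU share runs faster."""
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0 and 1")
    if tool_speedup <= 0:
        raise ValueError("tool_speedup must be positive")
    # The GPU share (1 - cpu_share) is untouched; the CPU share shrinks.
    return 1.0 / ((1.0 - cpu_share) + cpu_share / tool_speedup)


# Assumed scenarios: tools made 2x, 5x, and 10x faster.
for k in (2, 5, 10):
    print(f"tools {k}x faster -> turns {turn_speedup(0.42, k):.2f}x faster")
```

With a 42% CPU share, the ceiling under these assumptions is 1 / 0.58, or about 1.72x, no matter how fast the tools become. If tokens per turn stay fixed, token throughput per session would rise by the same factor.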