
New Metric Effective Feedback Compute Predicts Agent Harness Scaling Laws


// Scaling Laws for Agent Harnesses //

If you build agent harnesses, this one is worth your time. (bookmark it)

Most harness tuning treats every token and tool call as if raw volume is all that counts. New research shows that most of that volume does not count.

The work introduces Effective Feedback Compute (EFC), a coordinate that counts only the feedback an agent can actually act on.

Raw token and tool-call counts explain agent failure with an R² of only 0.33 to 0.42. EFC pushes that to 0.99.

Why does it matter? Once you budget by useful feedback instead of raw volume, reallocation alone lifts success from 0.27 to 0.90 at the same compute.

This also turns harness design from guesswork into something you can predict.

Paper: https://arxiv.org/abs/2605.29682

Learn to build effective AI agents in our academy: https://academy.dair.ai/
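
For readers less familiar with the R-squared figures quoted in the post: R-squared (the coefficient of determination) is the share of variance in an outcome that a predictor accounts for, so 0.99 means almost all of it and 0.33 means about a third. Below is a minimal sketch of the standard formula only. It is not the paper's EFC computation, the function name `r_squared` is made up for illustration, and the numbers are toy values, not results from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after using the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: error from always guessing the mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (hypothetical, not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]

# A predictor that matches every observation explains all the variance.
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])

# A predictor that always guesses the mean explains none of it.
mean_only = r_squared(observed, [2.5, 2.5, 2.5, 2.5])

print(perfect, mean_only)
```

Read this way, the post's claim is that raw token and tool-call counts sit much closer to the mean-only end of that range than EFC does.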

7:45 AM · May 29, 2026