
Benchmarks Show Open Source Models Fail To Deliver Immediate Cost Savings For Long-Horizon Agents

Original post

Curious finding while creating evals and benchmarks for long-horizon (100+ turn) agents. While it’s generally thought that a direct swap to open source models can bring immediate cost savings, that’s not what we saw off the bat. Two factors play a major role 👇

7:22 AM · May 19, 2026