9h ago

Benchmarks Show Open Source Models Fail To Deliver Immediate Cost Savings For Long-Horizon Agents

Original post

Curious finding while creating evals and benchmarks for long-horizon (100+ turn) agents. While it’s generally thought that a direct swap to open source models can bring immediate cost savings, that’s not what we saw off the bat. Two factors play a major role 👇

7:22 AM · May 19, 2026

Reposted by

#739@HWCHASE17
