1d ago

Epoch AI reports Claude strength in software engineering

Epoch AI aggregated benchmark results across frontier AI systems into domain-specific versions of its Epoch Capabilities Index (ECI). The analysis found that the Claude family's software engineering index (SWE-ECI) averages 2.7 points above its general ECI, while its math index (Math-ECI) averages 1.8 points below. Relative to general capability, Claude models therefore appear stronger than frontier competitors at software engineering and weaker at mathematics. A scatter plot illustrates the pattern across multiple Claude versions.

Original post

Claude is typically better at software engineering and worse at math than frontier competitors. Aggregating benchmarks to create our domain-specific ECI, we find the Claude family has an average SWE-ECI 2.7 points higher than their general ECI, and a Math-ECI 1.8 points lower.

11:07 AM · May 15, 2026

it's a chess benchmark bro

leela and stockfish have like 100M param networks to get to a 3600 rating

3:32 PM · May 16, 2026 · 30.2K Views