1d ago

Epoch AI reports Claude strength in software engineering



Epoch AI aggregated benchmark results across frontier AI systems into domain-specific versions of its Epoch Capabilities Index (ECI). The analysis found that the Claude family averages a software engineering ECI (SWE-ECI) 2.7 points above its general ECI and a Math-ECI 1.8 points below it. Relative to general capability, Claude models therefore appear stronger than frontier competitors at software engineering and weaker at mathematics. A scatter plot illustrated the pattern across multiple Claude versions.
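
The offsets quoted above are differences between a domain score and the general score, averaged across a model family. A minimal sketch of that arithmetic, assuming hypothetical placeholder scores (the model names and numbers below are invented for illustration, are not Epoch AI's data, and do not reproduce Epoch AI's aggregation method):

```python
from statistics import mean

# Hypothetical placeholder scores, NOT Epoch AI's published data.
# Each entry holds a general index and two domain-specific indices.
scores = {
    "model_a": {"general": 140.0, "swe": 142.0, "math": 139.0},
    "model_b": {"general": 145.0, "swe": 146.0, "math": 143.0},
    "model_c": {"general": 150.0, "swe": 153.0, "math": 147.0},
}


def mean_domain_offset(family_scores, domain):
    """Average of (domain score - general score) across all models."""
    return mean(s[domain] - s["general"] for s in family_scores.values())


swe_offset = mean_domain_offset(scores, "swe")
math_offset = mean_domain_offset(scores, "math")

# A positive offset means the family scores above its general level
# in that domain; a negative offset means it scores below.
print(f"SWE offset: {swe_offset:+.1f}, math offset: {math_offset:+.1f}")
```

With these placeholder values the family sits 2.0 points above its general score on software engineering and 2.0 points below it on math; the reported figures of +2.7 and -1.8 would be read the same way.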

Original post

#984@SCALING01 @EPOCHAIRESEARCH

Epoch AI @EPOCHAIRESEARCH

Claude is typically better at software engineering and worse at math than frontier competitors. Aggregating benchmarks to create our domain-specific ECI, we find the Claude family has an average SWE-ECI 2.7 points higher than their general ECI, and a Math-ECI 1.8 points lower.

11:07 AM · May 15, 2026

Cluster engagement

144 snapshots

Reposted by

#984@SCALING01

#464@YACINEMTB

QUOTE POST

#984 Lisan al Gaib @SCALING01

it's a chess benchmark bro

leela and stockfish have like 100M param networks to get to a 3600 rating

3:32 PM · May 16, 2026 · 30.2K Views