
Claude Opus 4.8 Ranks Second On APEX-Agents Benchmark

Original post

Claude Opus 4.8 (Max) places 2nd on APEX-Agents. It scores 42.5% Pass@1, behind Gemini 3.5 Flash (49.6%) and ahead of GPT 5.5 (38.4%). @claudeai Opus 4.8 has improved 8.6 percentage points over Opus 4.7 (33.9%). In ~6 months, Opus models have improved from 18.3% to 42.5%.
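As a quick sanity check on the figures quoted in the post, here is a minimal sketch that recomputes the ranking and the gains. The scores are copied from the post; the variable and function names (`scores`, `point_gain`, `relative_gain`) are hypothetical helpers for illustration and not part of any APEX-Agents tooling.

```python
# Pass@1 scores in percent, as quoted in the post.
scores = {
    "Gemini 3.5 Flash": 49.6,
    "Claude Opus 4.8 (Max)": 42.5,
    "GPT 5.5": 38.4,
}
OPUS_4_7 = 33.9        # Opus 4.7 score quoted in the post
OPUS_EARLIER = 18.3    # Opus score from ~6 months earlier, per the post


def point_gain(new: float, old: float) -> float:
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(new - old, 1)


def relative_gain(new: float, old: float) -> float:
    """Gain relative to the old score, in percent, rounded to one decimal."""
    return round((new - old) / old * 100, 1)


# Rank models from highest to lowest Pass@1.
ranking = sorted(scores, key=scores.get, reverse=True)
opus = scores["Claude Opus 4.8 (Max)"]

print("Ranking:", ranking)
print("Gain over Opus 4.7 (points):", point_gain(opus, OPUS_4_7))
print("Gain over ~6 months (points):", point_gain(opus, OPUS_EARLIER))
print("Relative gain over Opus 4.7 (%):", relative_gain(opus, OPUS_4_7))
```

The distinction matters when reading the post: 8.6 is a difference in percentage points, which is not the same as a relative improvement over the earlier score.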

APEX-Agents | Opus progress over time
10:25 AM · May 29, 2026