
GPT-5.5 Leads Updated SWE-rebench With 62.7% Task Resolution

Original post

first update: Claude Opus 4.8 – xhigh on march-may 110 tasks: 56.4%

gpt-5-xhigh: 62.7% – $2.25
gpt-5.5-medium: 58.9% – $0.98
Opus 4.8 - xhigh: 56.4% – $2.02
Opus 4.7 – high: 53.1% – $1.32
Opus 4.6 - high: 47.8% – $1.29

more open-weight models to come in ~1-2 weeks

1:27 PM · May 29, 2026