20h ago

Anthropic Opus Models Reach 38.4% On SWE-Bench Multimodal

Original post
John Yang #854 (@JYANGBALLIN, OP) | Kilian Lieret (@KLIERET)

SWE-bench multimodal is still very hard (numbers from Anthropic system cards)

3:09 PM · May 28, 2026

Cluster engagement

29 snapshots