Rishi Desai launches SWE-Marathon, a long-horizon coding benchmark where Claude Opus 4.8 leads with a 26% score · Digg