SWE-bench co-creator John Yang launches two simulator-backed CodeClash arenas to evaluate AI agents on logistics and cybersecurity · Digg