Serena launches DeepSWE, a long-horizon software engineering benchmark that tests AI coding agents on complex, high-output developer tasks · Digg