LoopCoder-v2 7B code model achieves 64.4 on SWE-bench Verified with two inference loops, but more loops degrade performance