SWE-bench creator John Yang opens public submissions for ProgramBench, which tests whether language models can rebuild programs from scratch · Digg