LisanBench runner `@scaling01` queries creator `@swyx` on the source of data used in a METR Evals discussion · Digg