LisanBench creator `@scaling01` argues identical benchmark scores overstate open-source model capabilities compared to closed models · Digg