Qwen 3.7 Plus outperforms GPT-5.4 and Claude-Opus-4.6 across 12 reasoning tests on LisanBench · Digg