Teortaxes says GLM 5.2 approaches Claude Opus on hard evaluations but fails simple tasks Claude 3.5 Sonnet easily handles · Digg