AI developer Teortaxes questions the validity of the prinzbench AI benchmark after GLM-5.2 scores a low 30 out of 99 · Digg