I predict 48 for GLM 5.2
AI commentator Teortaxes predicts that the unreleased GLM 5.2 will score 48 on the Artificial Analysis Intelligence Index.
Current GLM-5 and GLM-5.1 models both score 40.
Optimistic users predict GLM 5.2 will outperform Gemini and other models on the strength of high intelligence scores, while skeptical users dismiss the evaluations as unreliable or point to weak results on their personal tests.
@teortaxesTex will be ahead of gemini
this is a careful guess and not a claim that it's "worse than Gemini 3.5 Flash". I am weighing the actual scores here. It'll probably be way more useful than Gemini.
@zephyr_z9 I think there are a few evals here that will drag it down
Full Intelligence Index v4.1 weights:
➤ GDPval-AA v2: 20%
➤ Terminal-Bench 2.1: 16%
➤ τ³-Bench Banking: 14%
➤ Humanity's Last Exam: 12%
➤ AA-Omniscience Accuracy: 8%
➤ SciCode: 8%
➤ GPQA: 6%
➤ AA-LCR: 6%
➤ CritPt: 6%
➤ AA-Omniscience Non-Hallucination: 4%
Full per-model breakdowns below:

@teortaxesTex I’d say 52

@teortaxesTex solves 1/10 problems on my personal eval v4 expert solves 7/10
looks like it's not really a general purpose model

@zephyr_z9 @teortaxesTex i've not looked into the AA GDPval, but i've heard a few credible people say it's slop.
what's the specific issue?

@teortaxesTex my guess - 50-54

@teortaxesTex tbh gdpval is slop

@teortaxesTex ~52

@zephyr_z9 you can recompute the index without it. I'm predicting for this mixture
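
The recomputation suggested in this reply can be sketched in a few lines. This is only an illustration, assuming the index is a plain weighted average of per-eval scores on a 0-100 scale, which the thread does not confirm. The weights are the v4.1 ones quoted earlier in the thread; the function name `weighted_index` and the per-eval scores are hypothetical placeholders, not real GLM results.

```python
# Weights (in percent) as quoted in the thread for Intelligence Index v4.1.
WEIGHTS = {
    "GDPval-AA v2": 20,
    "Terminal-Bench 2.1": 16,
    "τ³-Bench Banking": 14,
    "Humanity's Last Exam": 12,
    "AA-Omniscience Accuracy": 8,
    "SciCode": 8,
    "GPQA": 6,
    "AA-LCR": 6,
    "CritPt": 6,
    "AA-Omniscience Non-Hallucination": 4,
}


def weighted_index(scores, weights=WEIGHTS, exclude=()):
    """Weighted mean of per-eval scores.

    ASSUMPTION: the index is a simple weighted average. Evals named in
    `exclude` are dropped and the remaining weights are renormalized.
    """
    kept = {name: w for name, w in weights.items() if name not in exclude}
    missing = [name for name in kept if name not in scores]
    if missing:
        raise KeyError(f"no score supplied for: {missing}")
    total_weight = sum(kept.values())
    return sum(scores[name] * w for name, w in kept.items()) / total_weight


# HYPOTHETICAL scores, for illustration only: 50 on every eval
# except a weaker 30 on GDPval-AA v2.
example_scores = {name: 50.0 for name in WEIGHTS}
example_scores["GDPval-AA v2"] = 30.0

full_mixture = weighted_index(example_scores)
without_gdpval = weighted_index(example_scores, exclude=("GDPval-AA v2",))
print(f"full mixture: {full_mixture:.1f}")       # 46.0
print(f"without GDPval: {without_gdpval:.1f}")   # 50.0
```

With these placeholder numbers, one weak eval carrying 20% of the weight pulls the full-mixture figure four points below the recomputed one. That is why a prediction only means something relative to a stated mixture.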

@teortaxesTex I predict 50

@teortaxesTex I'm going 51

@teortaxesTex THEY SHOULD KNOWLEDGE MAXX INSTEAD OF NON HALLU. SAME WITH MINIMAX. ONLY THE WHALE

@teortaxesTex 51

@teortaxesTex 52+?

@teortaxesTex I predict it will be 45 and will become the most intelligent open-source model.

@teortaxesTex 51

@teortaxesTex yeah 48 seems right.. waiting on AA

@teortaxesTex 46 my guess

@teortaxesTex 50+