GLM-5.2 leads open weights models and sits at #3 overall on GDPval-AA, a real-world agentic work benchmark
GLM-5.2 from @Zai_org scores 1524 Elo on GDPval-AA, which measures performance on real-world, economically valuable knowledge work through long-horizon, multi-turn tasks.
Key takeaways:
➤ #3 overall, behind only Claude Fable 5 (1783) and Claude Opus 4.8 (1615), and roughly level with GPT-5.5 (xhigh, 1509)
➤ The leading open weights model by a wide margin: the next open model, MiniMax-M3, scores 1408
➤ Ahead of many proprietary models, including Google's Gemini 3.5 Flash (1357), Qwen 3.7 Max (1289), and Muse Spark (1158)
➤ The tasks are agentic: GLM-5.2 averaged ~31 turns per task across 1,999 matches
➤ Consistent with the rest of its launch, GLM-5.2 also leads open weights on the Artificial Analysis Intelligence Index and ranks #3 on both the Agentic Index and AA-Briefcase
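
For a rough sense of what these Elo gaps mean head to head, here is a minimal sketch that converts a rating difference into an expected score. It assumes the conventional logistic Elo model with a 400-point scale; the post does not state how GDPval-AA Elo is fitted, so treat the formula, the `expected_score` helper, and the `scale` parameter as illustrative assumptions rather than the benchmark's actual method. The ratings are the ones quoted above.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the conventional logistic Elo model.

    Assumption: a 400-point scale, as in standard Elo. GDPval-AA may use
    a different fitting procedure; this is only an illustration.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Elo scores quoted in the post.
scores = {
    "Claude Fable 5": 1783,
    "Claude Opus 4.8": 1615,
    "GLM-5.2": 1524,
    "GPT-5.5 (xhigh)": 1509,
    "MiniMax-M3": 1408,
    "Gemini 3.5 Flash": 1357,
    "Qwen 3.7 Max": 1289,
    "Muse Spark": 1158,
}

glm = scores["GLM-5.2"]
for name, rating in scores.items():
    if name == "GLM-5.2":
        continue
    p = expected_score(glm, rating)
    print(f"GLM-5.2 vs {name}: gap {glm - rating:+d}, expected score {p:.2f}")
```

Under that assumption, the 116-point lead over MiniMax-M3 corresponds to an expected score of roughly 0.66 for GLM-5.2, while the 15-point gap to GPT-5.5 (xhigh) is about 0.52, close to a coin flip.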