Zhipu AI releases GLM-5.2, an open-weights model with a 1M context window that beats Claude on coding leaderboards · Digg

Story Overview

Zhipu AI dropped GLM-5.2 on June 13 with full MIT-licensed open weights now live on Hugging Face, featuring a claimed usable 1M-token context and immediate availability through its GLM Coding Plan tiers for tasks that stretch across hours of agentic work.

Original post

Andrew Curran@AndrewCurran_ · #682 in Tech

Design Arena@Designarena

BREAKING: GLM-5.2 is now 1st on Design Arena.

With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude Fable 5.

And it's open weights.

This is an improvement of 4 positions and 27 Elo points to achieve one of the highest Elo scores in our code categories since Design Arena started.

Huge congratulations to the @Zai_org on the release!

11:20 AM · Jun 16, 2026 · 3.7K Views

Developer Impact

MIT license lets anyone run it locally

The 744B-parameter MoE model downloads freely and runs via vLLM or Transformers, while API pricing lands at $1.40 per million input tokens, positioning it as a cheaper alternative to closed coding suites.
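To put the pricing in concrete terms, here is a minimal sketch of a cost estimate. It assumes the $1.40 input price above and the $4.40 per million output tokens quoted in the Agent Arena post later in this story, and it assumes flat per-token billing with no caching discounts or tiers, which the source does not confirm. The function name and the example workload are hypothetical.

```python
# Prices as quoted in the story (USD per million tokens).
INPUT_PRICE_PER_MTOK = 1.40
OUTPUT_PRICE_PER_MTOK = 4.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD, assuming flat per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 1M-token prompt and a 20K-token reply.
    print(f"${estimate_cost_usd(1_000_000, 20_000):.3f}")
```

Under these assumptions, filling the entire 1M-token window once with a 20K-token reply comes to roughly $1.49, nearly all of it input cost.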

Open Question

Arena leads rest on crowdsourced scores

Design Arena shows GLM-5.2 at 1360 Elo in code categories ahead of several Claude variants, yet independent third-party benchmarks remain absent at launch so the gap on long-horizon suites stays unverified outside company and arena data.
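For a sense of scale on these Elo figures, the sketch below converts a rating gap into an expected head-to-head win rate. It assumes the textbook Elo logistic formula with a 400-point scale; the source does not say which rating method Design Arena or Code Arena actually uses, so treat the numbers as illustrative only.

```python
def expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated model under standard Elo.

    rating_gap is (rating of model A) - (rating of model B).
    The 400-point scale is the textbook default, assumed here.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    # Gaps mentioned in the story: 27 points (Design Arena) and
    # 29 points (Code Arena: Frontend, vs. Claude Opus 4.7 Thinking).
    for gap in (27, 29):
        print(f"{gap}-point gap -> {expected_score(gap):.1%} expected win rate")
```

Under that assumption, a gap of 27 to 29 points corresponds to winning about 54% of pairwise comparisons. That is a real lead but a narrow one, which is consistent with the caution about relying on arena scores alone.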

Sentiment

Many users praised GLM-5.2 topping the Design Arena Code Leaderboard at 1360 Elo as proof open-weights models can lead closed ones, while others called the results fake or useless in practice.

Positive: 77.7% · Negative: 22.3% (81 comments with sentiment)

Related links

Via Z.AI: zai-org/GLM-5.2 · Hugging Face

Posts from X

Zephyr@zephyr_z9

Absolutely fucking crazy

Nathan Lambert@natolambert

It's hard to pinpoint open-closed gap and so-on, but I trust the @arena team and just look where GLM 5.2 is on this. An MIT licensed, to be open weight model. At this point you could argue they have a better agent than Gemini does. That's a serious accomplishment.

Nathan Lambert@natolambert

Still hard to expect the unexpected with AI. It goes to show how skilled many of the scientists are in China. They're hitting high peaks with much less compute.

Overall, I think the US models are really ahead, but you can't just discount the Chinese labs. Not at all.

Lisan al Gaib@scaling01

this is pretty insane and that's only 744B

imagine a properly RL'd DeepSeek-V4-Pro

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

I don't trust arenas much. Waiting for AA scores I get the feeling this is the real one. Actual, solid, all-around excellent model, genuine frontier class

elvis@omarsar0

Impressive if true!

Better than Claude Fable 5? Wow!

Design is really lacking in these frontier models, so I'm very curious to test GLM-5.2 myself.

Testing this already on a few internal use cases and will report back on findings.

Zephyr@zephyr_z9

Looks Like GLM 5.2 was the real Le Chatton Fat

elvis@omarsar0

Looks strong at SWE too.

Proximal@ProximalHQ

GLM 5.2 ranks #3 on FrontierSWE. It is only behind Fable 5 and Opus 4.8, and it outperforms GPT-5.5.

This is the first model that closes the large gap between models from Anthropic / OpenAI and other providers, and it is the strongest open-weight model by far.

GLM-5.2 (Max) by @Zai_org ranks #10 on the new Agent Arena leaderboard, closely matching Claude-Opus-4.8 (non-thinking) and is the #1 open model by a wide margin!

In Agent Arena, we measure models on millions of real-world, long-horizon agentic tasks from a global community of users. Models can access web search, filesystem, and terminal tools to complete complex workflows. The leaderboard measures model performance on outcomes relative to the average model using a causal tracing methodology.

Compared to 5.1, GLM-5.2 (Max) climbs from #13 to #10. Its clearest gains are confirmed task success, and user praise vs. complaint. Bash capabilities and tool hallucination remain stable. There is a tradeoff in steerability compared to the previous model (-6.0% vs. +1.2%).

GLM-5.2 remains the same price as GLM-5.1, $1.4/$4.4 per input/output MTokens. 1M context window.

Huge congrats @Zai_org for the incredible release!

See thread for details on how GLM-5.2 (Max) performs across 5 different signals.

Z.ai@Zai_org

Introducing GLM-5.2: Frontier Intelligence, Open Weights

- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1

Tech Blog: http://z.ai/blog/glm-5.2
Weights: http://huggingface.co/zai-org/GLM-5.2
API: http://docs.z.ai/guides/llm/glm-5.2
Coding Plan: http://z.ai/subscribe
Chat: http://chat.z.ai

Exciting news: GLM-5.2 (Max) ranks #2 in Code Arena: Frontend, with +29pt over Claude Opus 4.7 (Thinking) and only behind Fable 5! GLM-5.2 is the best open model vs Kimi-K2.6 and Minimax-M3 by a large margin.

- #2 React and #4 HTML sub-leaderboards
- Ranks as the top model in nearly all sub categories: Brand & Marketing, Reference-Based Design, Data & Analytics, Consumer Product, Gaming, and Simulations.

Congrats @Zai_org for the incredible milestone!

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

> sonnet lol

RAY@3xcalibaneur

Every 3 weeks, some “open source” glup shitto model pops into existence from nowhere, scores a bunch of big numbers on the goybenches, and then everyone forgets them in a day because they’re always palpably worse than like, Claude Sonnet Medium. Who even cares?

Design Arena@Designarena

Try it now on http://DesignArena.ai

elvis@omarsar0

Damn! That looks nice.

Anastasios Nikolas Angelopoulos@ml_angelopoulos

Just to be clear, if you remove Fable, which is unavailable, GLM-5.2 (Max) is the #1 model in the world for frontend coding.

This is a huge moment. OSS has caught up with proprietary, and China has caught up with the US, in this very important domain.

GLM-5.2 (Max) is #25 overall in the Text Arena, similar to GLM-5.1. Diving deeper into the comparison in Text, we can see that though GLM-5.2’s rank is down overall, its largest gains are across:

- Sub-categories: Expert Arena, Multi-Turn
- Occupational categories such as Life, Physical & Social Science, Creative Writing and Medicine & Healthcare.

GLM-5.2 ranks as the best open model in Code Arena: Frontend.

In the Code Arena: Frontend, models are evaluated on agentic frontend coding tasks from real users building apps and websites (HTML and React).

GLM-5.2 (Max) ranks #10 overall (+4.4%)

- tied for #1 Tool Hallucination (+1.9%)
- #3 Confirmed Task Success (+9.4%)
- #3 Praise vs. Complaint (+14.9%)
- #16 Bash Recovery (+1.7%)
- #20 Steerability (-6.0%)

Learn more about the causal tracing methodology for Agent Arena on our blog: http://arena.ai/blog/agent-arena-methodology

Lisan al Gaib@scaling01

@natolambert @arena it's about where you would expect it with a 6 month backward looking gap, still short of Opus 4.6
