
GLM-5.2 launches on Hugging Face with a 1-million-token context window and an architecture that cuts per-token FLOPs by 2.9× at that context length

It also increases speculative decoding acceptance length by up to 20 percent


Original post

Novita AI@novita_labs

🤗 GLM-5.2 is now live on @huggingface — supported by Novita.

Frontier-level coding and agent capabilities.

1M-token context window.

Built for autonomous agents and long-horizon coding assistants.

11:19 AM · Jun 16, 2026 · 3.3K Views

Sentiment

Many users are excited about GLM-5.2's 1M-token context window for enabling advanced agent workflows and long-horizon coding tasks, while a few criticize the website design and marketing approach.

Positive: 96.4%

Negative: 3.6%

30 comments with sentiment.


Posts from X


Hunter Bown@goodhunt

@jietang @ml_angelopoulos


Mad ML scientist@HououinTyouma

@jietang really incredible work, I don't know why you guys didn't call it glm 6 at this point. unless you have something better training


jietang@jietang

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include:

Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work

Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency

Improved Architecture: We propose IndexShare, which reuses the same indexer across every four sparse attention layers, reducing per-token FLOPs by 2.9× at a 1M context length. We also improve GLM-5.2’s MTP layer for speculative decoding, increasing the acceptance length by up to 20%

Pure Open: An MIT open-source license: no regional limits, technical access without borders

Supporting long-horizon tasks starts with making long context engineering-usable: the model must maintain quality across long, messy coding-agent trajectories, not just accept more tokens. A 1M context is easy to claim, but much harder to keep reliable under real engineering pressure. To this end, we substantially expanded 1M-context training for coding-agent scenarios, covering large-scale implementation, automated research, performance optimization, and complex debugging. The result is a long-context system that is not only wide in scope, but solid in execution: a practical substrate for sustained engineering work.
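The post describes IndexShare only in one sentence, so the following is a toy sketch of the general idea rather than GLM-5.2's actual implementation. It assumes an indexer that scores every cached token and keeps the top-k positions, and it reuses that selection for the next layers in the same group of four. The names `indexer_topk`, `sparse_attention`, `decode_step`, and `make_layers`, the scoring rule, and all sizes are hypothetical. The sketch only counts indexer calls; it does not reproduce the reported 2.9× FLOPs figure.

```python
import math
import random

GROUP_SIZE = 4  # from the post: one indexer shared across every four sparse attention layers


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def indexer_topk(query, index_keys, k):
    """Hypothetical indexer: score every cached token and keep the k best positions."""
    scores = [dot(key, query) for key in index_keys]
    best = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    return sorted(best)


def sparse_attention(query, keys, values, idx):
    """Softmax attention restricted to the selected token positions."""
    scale = math.sqrt(len(query))
    logits = [dot(keys[i], query) / scale for i in idx]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    out = [0.0] * len(query)
    for w, i in zip(weights, idx):
        for d, v in enumerate(values[i]):
            out[d] += (w / total) * v
    return out


def decode_step(x, layers, k, group_size=GROUP_SIZE):
    """Pass one token through the stack; rerun the indexer only at group boundaries."""
    indexer_calls = 0
    idx = []
    for n, layer in enumerate(layers):
        if n % group_size == 0:
            # Assumption: the first layer of each group selects positions for the whole group.
            idx = indexer_topk(x, layer["index_keys"], k)
            indexer_calls += 1
        attended = sparse_attention(x, layer["keys"], layer["values"], idx)
        x = [a + b for a, b in zip(x, attended)]  # residual connection
    return x, indexer_calls


def make_layers(num_layers, seq_len, dim, seed=0):
    """Random toy caches standing in for real per-layer key/value states."""
    rng = random.Random(seed)

    def mat():
        return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]

    return [{"index_keys": mat(), "keys": mat(), "values": mat()} for _ in range(num_layers)]


layers = make_layers(num_layers=8, seq_len=64, dim=16)
x0 = [0.1] * 16
_, shared_calls = decode_step(x0, layers, k=8)
_, per_layer_calls = decode_step(x0, layers, k=8, group_size=1)
print(shared_calls, per_layer_calls)  # 2 indexer calls versus 8
```

In this toy, the indexer is the only part whose cost grows with the full sequence length, since attention itself touches just k positions. That is one plausible reason why sharing it across layers would matter most at long context, though the post does not break down where the savings come from.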

This capability is reflected in GLM-5.2's performance on three long-horizon coding benchmarks. FrontierSWE measures whether an agent can complete open-ended technical projects at the scale of hours to tens of hours, spanning systems optimization, large-scale code construction, and applied ML research. On this benchmark, GLM-5.2 trails Opus 4.8 by only 1%, while edging out GPT-5.5 by 1% and Opus 4.7 by 11%. On PostTrainBench, where each agent is given an H100 GPU and evaluated by how much it can improve small models through post-training, GLM-5.2 outperforms both Opus 4.7 and GPT-5.5, ranking second only to Opus 4.8. On SWE-Marathon, an ultra-long-horizon software engineering benchmark covering tasks such as building compilers, optimizing kernels, and developing production-grade services, GLM-5.2 still has room to grow, trailing Opus 4.8 by 13% while remaining second only to the Opus series. Across all three benchmarks, GLM-5.2 is the highest-ranked open-source model, showing that its 1M context has translated into practical long-horizon delivery capability.


Novita AI@novita_labs

Explore it 👇 https://huggingface.co/zai-org/GLM-5.2


jacky@jjacky

@jietang congrats!!


ily⚡️@0xIlyy

@jietang This is unreal


THΞGABO🍌@thegaboeth

@novita_labs @huggingface GLM-5.2’s 1M context opens doors. Onchain agents need that kind of runway to run verifiable workflows. Quietly massive.


crashout@0xCRASHOUT

@jietang new cursor model just dropped


Brendan Graham@brendanigraham

@jietang Very impressive!


Samian@ApplyWiseAi

@jietang 1M context for long-horizon is the spec i care about. curious how it holds up when the task actually spans the full window... most models fall apart past ~400k


Adel Bucetta@adelbucetta

@jietang glm-5.2 is more than just a model upgrade. it shifts the burden of long-horizon task planning from humans to algorithms, and that changes everything.
