🤗 GLM-5.2 is now live on @huggingface — supported by Novita.
Frontier-level coding and agent capabilities.
1M-token context window.
Built for autonomous agents and long-horizon coding assistants.
It also improves speculative decoding acceptance length by up to 20%.
Many users are excited about GLM-5.2's 1M-token context window for enabling advanced agent workflows and long-horizon coding tasks, while a few criticize the website design and marketing approach.

@jietang @ml_angelopoulos

@jietang really incredible work, I don't know why you guys didn't call it glm 6 at this point. unless you have something better training
We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include:
Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work.
Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency.
Improved Architecture: We propose IndexShare, which reuses the same indexer across every four sparse attention layers, reducing per-token FLOPs by 2.9× at a 1M context length. We also improve GLM-5.2's MTP layer for speculative decoding, increasing the acceptance length by up to 20%.
Pure Open: An MIT open-source license: no regional limits, technical access without borders.
Supporting long-horizon tasks starts with making long context engineering-usable: the model must maintain quality across long, messy coding-agent trajectories, not just accept more tokens. A 1M context is easy to claim, but much harder to keep reliable under real engineering pressure. To this end, we substantially expanded 1M-context training for coding-agent scenarios, covering large-scale implementation, automated research, performance optimization, and complex debugging. The result is a long-context system that is not only wide in scope, but solid in execution: a practical substrate for sustained engineering work.
This capability is reflected in GLM-5.2's performance on three long-horizon coding benchmarks. FrontierSWE measures whether an agent can complete open-ended technical projects at the scale of hours to tens of hours, spanning systems optimization, large-scale code construction, and applied ML research. On this benchmark, GLM-5.2 trails Opus 4.8 by only 1%, while edging out GPT-5.5 by 1% and Opus 4.7 by 11%. On PostTrainBench, where each agent is given an H100 GPU and evaluated by how much it can improve small models through post-training, GLM-5.2 outperforms both Opus 4.7 and GPT-5.5, ranking second only to Opus 4.8. On SWE-Marathon, an ultra-long-horizon software engineering benchmark covering tasks such as building compilers, optimizing kernels, and developing production-grade services, GLM-5.2 still has room to grow, trailing Opus 4.8 by 13% while remaining second only to the Opus series. Across all three benchmarks, GLM-5.2 is the highest-ranked open-source model, showing that its 1M context has translated into practical long-horizon delivery capability.
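
The IndexShare idea in the announcement (one indexer result reused across every four sparse attention layers) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `topk_indices`, `run_layers`, and `toy_scores` are hypothetical, the scoring is fake, and the post does not describe the real indexer, so this does not reproduce the 2.9× FLOPs figure. It only shows the reuse pattern: the first layer in each group selects which tokens to attend to, and the remaining layers in the group reuse that selection.

```python
from typing import Callable, List, Tuple


def topk_indices(scores: List[float], k: int) -> List[int]:
    """Return positions of the k highest scores, sorted by position."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def toy_scores(layer: int) -> List[float]:
    """Hypothetical stand-in for an indexer's per-token relevance scores."""
    return [float(((layer + 1) * (pos + 3)) % 7) for pos in range(10)]


def run_layers(
    num_layers: int,
    share_every: int,
    score_fn: Callable[[int], List[float]],
    k: int,
) -> Tuple[List[List[int]], int]:
    """Select tokens per layer, running the indexer once per group of layers.

    Returns the per-layer token selections and the number of indexer calls.
    """
    selections: List[List[int]] = []
    indexer_calls = 0
    cached: List[int] = []
    for layer in range(num_layers):
        if layer % share_every == 0:
            # First layer of a group: run the (assumed) indexer and cache it.
            cached = topk_indices(score_fn(layer), k)
            indexer_calls += 1
        # Other layers in the group reuse the cached selection.
        selections.append(cached)
    return selections, indexer_calls
```

With 8 layers and `share_every=4`, `run_layers(8, 4, toy_scores, 3)` calls the indexer twice instead of eight times. The actual savings in a real model would depend on how much of the per-token cost the indexer accounts for, which the post does not break down.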

Explore it 👇 https://huggingface.co/zai-org/GLM-5.2

@jietang congrats!!

@jietang This is unreal

@novita_labs @huggingface GLM-5.2’s 1M context opens doors. Onchain agents need that kind of runway to run verifiable workflows. Quietly massive.

@jietang new cursor model just dropped

@jietang Very impressive!

@jietang 1M context for long-horizon is the spec i care about. curious how it holds up when the task actually spans the full window... most models fall apart past ~400k

@jietang glm-5.2 is more than just a model upgrade. it shifts the burden of long-horizon task planning from humans to algorithms, and that changes everything.

@goodhunt @ml_angelopoulos haha

@jietang 1m token context is nothing new after gemini, but the long-horizon task improvements could be significant if benchmarks hold up.

@jietang Thank you for all that hard work! what an amazing result.

@jietang For agents and robotics, the bottleneck isn't context length.
It's whether the model can follow a 50-step plan without losing track.
Does GLM-5.2 solve that, or is the improvement mainly buffer size?

@jietang It is working well for me so far! Great work!

@jietang Congratulations on the launch and the great results on PostTrainBench and FrontierSWE.

@jietang Great work, Professor Tang 👍

@jietang congrats

@jietang @louszbd What a home run