Zhipu GLM 5.2 Demonstrates Early Recursive Self-Improvement in Post-Training

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

One more detail: this release points to a discontinuous maturation in Zhipu's post-training stack. Yeah, it was distilled from Claude, whatever; that served as a seed dataset. But you don't get those CritPT/Posttrain-Bench/WeirdML etc. results just by finetuning on claudisms. They now know how to build agents that synthesize their own diverse training environments and get stronger. They are entering the early stage of RSI (recursive self-improvement). The ceiling of this approach is very far away; they can have steadily improving GLM-5s every 2 months until the end of the year without doing anything new. I'm pretty certain that the next one will be stronger across the board than Opus 4.8 (maybe modulo some holdovers like WeirdML). The gap may stay stable, or modestly decrease, or increase if you count Fable/Mythos, but there is no slowdown in open-weights capabilities progress, and here we clearly have something that's *at least* on "Opus 4.55" or "GPT 5.4" level. Opus 4.5 was already a paradigm shift (and I'd argue that GPT 5.2 was a bigger one). We have that on huggingface now. It can help build more of itselves. Make of that what you will.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Zvi on GLM 5.2. Mostly correct. One detail he underrates (again) is that the model layer and the hardware layer are distinct. 5.2's niche is, ironically, larger than what http://Z.AI can serve as a product. On B300s, we could run it *faster* and cheaper than Gemini-Flash.

8:20 AM · Jun 22, 2026 · 9.9K Views

Posts from X

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

or as said at @interconnectsai

Replies

evrazian_schizo @rationaleist

@teortaxesTex Evidently even last-gen 700B models are far from saturated, and at this point every good checkpoint is an asset for developing the next one. The absolute moat is still ephemeral for now.

evrazian_schizo @rationaleist

@teortaxesTex I think the biggest artifact here isn't even the model but that they did it with better post-training alone. Unless we assume that fixed indexer sharing imposes benign structure, it's virtually the same as V3.2 and had less wall-clock training time than K2.7.

evrazian_schizo @rationaleist

@teortaxesTex Zhipu could take the more straightforward Whale and Moonshot innovations, scale them to what they can handle, and apply the post-training framework they already have. It's also possible for DS and Kimi to step up their RL and iteration speed.
