Has anyone done any speculation on the training recipe of GLM 5.2? Beyond extensive RL, we know it's (at least?) a new midtrain ("GLM-5.2 is trained with IndexShare from mid-training with 128K sequence length") with arch changes.
Pleias CTO Pierre-Carl Langlais argues GLM 5.2 includes architectural updates alongside its 128K IndexShare mid-training, which Elie Bakouch disputes.
The training recipe also features extensive reinforcement learning.
Some users see the speculated GLM-5.2 mid-training changes as good news for the open ecosystem, since synthetic generation could shortcut an expensive data-building capability.

actually, along with other closed labs signals, doesn't seem great news for custom rl env sellers.

@yacineMTB (likely through the OPD recipe, since they mention it, "efficiently merging more than ten expert models into the final model")

Based on the bouba shape, my guess would be hard synth/rl env scaling with recursive generative design+eval.
@Dorialexander (except index sharing ok but they already published this paper and just didn't need it for smaller context)
@Dorialexander there are no arch changes?
@yacineMTB yeah and diversity/combinations.

@Dorialexander when you say RL env scaling, you mean total volume of RL envs right?

and, conversely, much better news for the open ecosystem that can maybe shortcut a billion-dollars data building capability by generating it all. though you'll still need hard skills.

@Dorialexander I think they explained some of the stuff in their paper on training 5.0

@ChuhaiDev ok you were right.
@Dorialexander ok i was confused bc you said
> we know it's a new midtrain with arch changes
@eliebakouch None that I have seen. Param count identical.

@ChuhaiDev yeah but doesn't really explain the sudden take-off.

@Dorialexander Anthropic will let us know soon through another blog post.

@Dorialexander indexshare at 128K in midtrain points to multi-document packing so the model picks up cross-doc signals early.
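To make the packing guess concrete, here is a minimal sketch of multi-document packing. It assumes greedy concatenation with an EOS separator between documents; `pack_documents`, `eos_id`, and the toy `seq_len` are hypothetical names for illustration, and nothing here is taken from the GLM-5.2 recipe.

```python
from typing import List, Tuple


def pack_documents(
    docs: List[List[int]], seq_len: int, eos_id: int
) -> List[Tuple[List[int], List[int]]]:
    """Greedily concatenate tokenized docs into sequences of at most seq_len tokens.

    Returns (token_ids, doc_ids) pairs. doc_ids records which source document
    each token came from, so a trainer could either mask attention across
    documents or leave it open to expose cross-document signal.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")

    packed: List[Tuple[List[int], List[int]]] = []
    tokens: List[int] = []
    doc_ids: List[int] = []

    for doc_index, doc in enumerate(docs):
        piece = doc + [eos_id]  # separator marks the document boundary
        while piece:
            room = seq_len - len(tokens)
            take = piece[:room]
            tokens.extend(take)
            doc_ids.extend([doc_index] * len(take))
            piece = piece[room:]  # overflow spills into the next sequence
            if len(tokens) == seq_len:
                packed.append((tokens, doc_ids))
                tokens, doc_ids = [], []

    if tokens:  # keep the final, partially filled sequence
        packed.append((tokens, doc_ids))
    return packed


# Toy usage: seq_len=4 stands in for a 128K window.
example = pack_documents([[1, 2, 3], [4, 5], [6]], seq_len=4, eos_id=0)
```

In the toy call, the second packed sequence holds tokens from two different documents, which is the situation the reply is pointing at. Whether attention is allowed to cross that boundary is a separate choice that this sketch leaves open.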
@eliebakouch ah yeah meant light arch changes.

@Dorialexander @yacineMTB MAI combined 3 experts into final model via SFT (which I was like, no way this would work? with my limited knowledge I expected OPD to be more robust)
I'm wondering what were expert objectives/curriculums for 5.2. Lot of expert models

@Dorialexander @ChuhaiDev Lol

@Dorialexander I keep wondering how much comes from the midtraining vs the RL phase. Without ablations it's hard to know what's actually moving the needle.