Thanks for the support!
A small note: slime has supported not only OPD, but the full RL + OPD post-training workflow since GLM-4.5.
More to come for scalable agentic RL infra.
Incredible how Z.ai literally has their RL infrastructure open source.
The entire OPD post-training of GLM-5.2 on this slime platform took ~2 days.
https://github.com/THUDM/slime