Open-weight models continue to nip at the heels of proprietary models. The claims that frontier models are pulling away seem thin to me. And given the timing, it is hard to see how GLM-5.2 could have benefited substantially from distilling Fable.
Everyone benchmarks GLM-5.2 against the frontier now. So we did too.
We put GLM-5.2's plan up against Claude Fable 5's plan, the one that won our last frontier round. Same prompt, same task, same rubric.
Fable scored 9.1. GLM-5.2 scored 9.0.


