This is good news, but the fact that Composer 2.5 comes out on top of GLM 5.2 here should trigger a serious re-evaluation of the entire benchmark. In practice, Composer 2.5 is nowhere near GLM 5.2 except on basic tasks... and that undermines trust in Cursor's entire model evaluation / publicity.
You can now try GLM 5.2 in Cursor!
Excited to see more useful open models. Thank you to Fireworks for partnering here. Results from our evals ↓