Critic Questions Cursor GLM 5.2 Benchmark After Composer 2.5 Tops Evals

Original post

This is good news, but the fact that Composer 2.5 comes out on top over GLM 5.2 here should trigger a serious re-evaluation of the entire benchmark. It's nowhere near in practice except basic tasks... and that undermines trust in Cursor's entire model evaluation / publicity.

Lee Robinson @leerob

You can now try GLM 5.2 in Cursor!

Excited to see more useful open models, thank you to Fireworks for partnering here. Results from our evals ↓

9:56 PM · Jun 24, 2026 · 578 Views

Sentiment

Users are sarcastically mocking Cursor for allegedly inflating results on its own private benchmarks after Composer 2.5 topped GLM 5.2 in its evals, using puns about benchmaxxing.

Pos: 0.0%

Neg: 100.0%

1 comment with sentiment.

Posts from X

Alex J. Champandard 🌻 @alexjc

Cursor: What do you call benchmaxxing your own private benchmarks?

Nobody: maxturbenchion
