Introducing Composer 2.5, our most powerful model yet.
It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions.
For the next week, we’re doubling the included usage of the model.
Internal test routed all company chats to the model for two days.
Many users praise Cursor's Composer 2.5 for its benchmark gains, low-cost efficiency, and ability to handle complex refactors like reading their mind, while a few dismiss it as trash or inferior to Opus.
Try Composer 2.5 on Cursor!
Composer 2.5 is now the most-chosen model in Cursor.
We're giving everyone 10x usage for the rest of the day. Enjoy!
Try it out!
(Partially trained on Colossus 2)
Try Composer 2.5
New CursorBench results just dropped.
Two big takeaways.
Composer 2.5 is way better than most people think.
63.2% score at $0.55 per task.
Nearly matching Opus 4.7 Max and GPT 5.5 Extra High at 20x less cost.
This is insane value.
Gemini 3.5 Flash is #10 at 49.8%.
Below GPT 5.5 Low. Below Opus 4.7 Low.
Google's newest model can't even beat budget tier competition.
Composer 2.5 is the sleeper.
Gemini 3.5 Flash is the disappointment.
Composer 2.5 is a significant step up from Composer 2.
This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.
the team did an internal test of this model last week
the whole company (bar a few exceptions) had all their cursor chats redirected to composer 2.5 for like 2 days.
i didn't even notice, which I think is a testament to the progress of this model. go use it, it's very good.
yeah that's pretty good
xAI might be able to cook with Cursor data + 10T model
We've gotten really really good at RL. Composer 2.5 is fighting well above its weight class.
Very excited for the next release as we scale model sizes and FLOPs with @SpaceXAI!
Composer 2.5 is built on the same open-source base as Composer 2, Moonshot’s Kimi K2.5.
We improved Composer by scaling training, generating more complex RL environments, and introducing new learning methods.
For example, we use text feedback during RL to learn faster by assigning credit in rollouts spanning hundreds of thousands of tokens.
composer 1 was fast
composer 2 was fast and intelligent
composer N:
This is very bullish for SpaceXAI
Composer 2.5 is exceptionally intelligent and up to 10x more efficient than similarly capable models.
cursor is at frontier scale, both in terms of performance and compute
if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens
if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens.
assumptions are a lower bound: 35% MFU, FP8, ~3-4% sparsity like K2, H100 efficiency. model/token allocation is the mean between K2+K2.5 data point and Inclusion AI compute optimal rules for MoE
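A rough sanity check of the numbers above, as a minimal sketch. It assumes the common rule of thumb that training compute is about 6 x active parameters x tokens; this is a swapped-in approximation, not the post's own method, which also factors in MFU, FP8, and H100 efficiency. The function and variable names are hypothetical.

```python
# Back-of-envelope check of the figures quoted in the post above.
# ASSUMPTION: training FLOPs ~= 6 * active_params * tokens (a common rule of
# thumb). The post's own accounting (35% MFU, FP8, H100 efficiency) is not
# reproduced here.


def train_flops(active_params: float, tokens: float) -> float:
    """Approximate training compute in FLOPs using C ~= 6 * N * D."""
    return 6.0 * active_params * tokens


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of total MoE parameters that are active per token."""
    return active_params / total_params


# Parameter and token counts exactly as quoted in the post.
SCENARIOS = {
    "composer 2.5 budget as a pre-train": {
        "active": 200e9, "total": 6.3e12, "tokens": 56e12,
    },
    "composer 3, 50% of budget on pre-training": {
        "active": 500e9, "total": 15.3e12, "tokens": 135e12,
    },
}

if __name__ == "__main__":
    for name, s in SCENARIOS.items():
        flops = train_flops(s["active"], s["tokens"])
        frac = active_fraction(s["active"], s["total"])
        print(f"{name}: ~{flops:.2e} FLOPs, active/total = {frac:.1%}")
```

Under this approximation both scenarios come out with an active fraction a little over 3%, which is in line with the "~3-4% sparsity like K2" assumption stated in the post.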
really impressed by the progression between composer 2 and composer 2.5
My first impressions of the new @cursor_ai Composer 2.5 model
+ pretty fast and efficient model
+ does a great job, i'd say it's almost as strong as opus 4.7 (or in some cases just at the same level)
+ cheap model
+ good at frontend
- still a bit generic design when used without skills
I'll try to post a few results I got later :)
Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.
Intelligence too cheap to meter. This is the real deal. Composer 2.5 is an efficiency beast.
Huge, did NOT expect that release. Evals look very solid, significant jump compared to composer 2!
But: it’s 10x more efficient than the competition. Looks really exciting. Need to try it out
i wrote a guide on optimizing context usage 6 months ago that i never posted. back then with the models available, you could only pick 2 of 3:
1. intelligent
2. fast
3. cheap

intelligent + fast = expensive
fast + cheap = dumb
cheap + intelligent = slow
now, with composer 2.5, this is no longer true and the post is obsolete. looking at TPS, avg cost per task, and score from cursorbench, it's clearly capable of all three
but benchmarks are just benchmarks. what matters is how it feels to use and if it can actually accomplish your tasks. from the feedback so far, that's very much the case
go try it out if you haven't already
Cursor's new Composer 2.5 takes third on the Artificial Analysis Coding Agent Index and is ~10-60x lower cost than the higher-effort Opus 4.7 and GPT-5.5 variants above it. This release puts Composer among the leading coding agent models, something that wasn’t clear for past releases
@cursor_ai has released Composer 2.5, the latest model in its Composer line. Composer 2.5 scored 62 on our Coding Agent Index, a 14 point gain over Composer 2 (48). This puts it in third place of our tested agents, behind only Claude Opus 4.7 (max) in Claude Code (66) and GPT-5.5 (xhigh reasoning) in Codex (65). These cost $4.10 and $4.82 per task respectively, ~10x the cost of Composer 2.5 Fast ($0.44) and ~60x the cost of Composer 2.5 standard ($0.07).
Key results for Composer 2.5 in Cursor CLI:
➤ Cost-quality Pareto frontier: At $0.07 (standard) and $0.44 (Fast) per task, Composer 2.5 is cheaper than every other agent scoring above 60 on the Index. Medium-effort peers cost $1.24–$2.21 per task; higher-effort variants land 3-4 points above at $4.10–$4.82
➤ Per-benchmark gains vs Composer 2: +35 points on SWE-Bench-Pro-Hard-AA (12% → 47%), +2 points on Terminal-Bench v2 (64% → 66%), and +3 points on SWE-Atlas-QnA (69% → 72%). At 47%, Composer 2.5's score on SWE-Bench-Pro-Hard-AA is comparable to Claude Opus 4.7 (max) in Claude Code
➤ Among the fastest coding agents: Composer 2.5 Fast runs at an average wall time of 6.7 minutes per task, the third-fastest agent on the Artificial Analysis Coding Agent Index, behind only Claude Opus 4.7 (medium) in Claude Code (5.8m) and GPT-5.5 (medium) in Cursor CLI (6.2m)
➤ Fast mode enables better responsiveness at 6x pricing: Fast runs 30% faster than standard Composer 2.5, but is ~6x the cost per task ($0.44 vs $0.07). Token pricing is 6x higher for Fast: $3.00/$15.00 vs $0.50/$2.50 per million input/output tokens
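The cost multiples quoted above can be recomputed from the per-task and per-token prices in the post. The snippet below is a small illustrative sketch; the dictionary and function names are hypothetical, and the only inputs are the dollar figures already stated above.

```python
# Recompute the cost multiples from the dollar figures quoted in the post.
# All names here are illustrative; only the prices come from the source.

# Average cost per task in USD.
COST_PER_TASK = {
    "opus_4_7_max": 4.10,
    "gpt_5_5_xhigh": 4.82,
    "composer_2_5_fast": 0.44,
    "composer_2_5_standard": 0.07,
}

# Token prices in USD per million tokens, as (input, output).
TOKEN_PRICE = {
    "standard": (0.50, 2.50),
    "fast": (3.00, 15.00),
}


def cost_multiple(expensive: str, cheap: str) -> float:
    """How many times more one agent costs per task than another."""
    return COST_PER_TASK[expensive] / COST_PER_TASK[cheap]


def token_price_multiple() -> tuple[float, float]:
    """Fast-to-standard price ratio for (input, output) tokens."""
    fast_in, fast_out = TOKEN_PRICE["fast"]
    std_in, std_out = TOKEN_PRICE["standard"]
    return fast_in / std_in, fast_out / std_out


if __name__ == "__main__":
    for rival in ("opus_4_7_max", "gpt_5_5_xhigh"):
        for variant in ("composer_2_5_fast", "composer_2_5_standard"):
            print(f"{rival} vs {variant}: {cost_multiple(rival, variant):.1f}x")
    print("fast vs standard per task:",
          f"{cost_multiple('composer_2_5_fast', 'composer_2_5_standard'):.1f}x")
    print("fast vs standard token price (in, out):", token_price_multiple())
```

The per-task ratios land near the rounded "~10x" and "~60x" figures, and the token price ratio is 6 for both input and output tokens.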
Model details:
➤ Base model: Continued training on @Kimi_Moonshot's open weights Kimi K2.5 as with Composer 2, with Cursor reporting ~85% of total compute from its own additional training and reinforcement learning
➤ Pricing: $0.50/$2.50 per million input/output tokens for the standard variant; $3.00/$15.00 for the Fast variant (the default in Cursor)
➤ Available exclusively in Cursor: both Cursor IDE and Cursor CLI; an externally accessible API is not available
Congratulations @cursor_ai and @mntruell on the impressive release!
i just uninstalled cursor literally 27 days ago and now i gotta reinstall it and resubscribe already cause they released an insane model of their own. i did not see this one coming

composer 2.5 feels unlimited, insanely fast too. went from using 3 models to just one.
congrats @cursor_ai 🖤

@beffjezos The trend is strong

Learn more about Composer 2.5: http://cursor.com/blog/composer-2-5