Cursor releases Composer 2.5, built on Kimi K2.5, and says it is better at sustained work on long-running tasks and follows complex instructions more reliably

Composer 2.5 is now available in Cursor. It's a substantial improvement in intelligence and behavior over Composer 2.

Cursor 2026-05-19

Context & Ripple Effects

Cursor introduced Composer 2 as a coding-focused agent for long, autonomous tasks, with usage-based pricing and a higher-priced, faster variant. Subsequent coverage established that Composer 2 was built on Kimi K2.5 via Fireworks AI, and Cursor acknowledged that it should have disclosed that base more clearly.

Composer 2.5 extends that model line with claims centered on task persistence and instruction-following, the two capabilities that matter most when coding agents are asked to operate over larger, multi-step projects rather than isolated prompts.

First-order effects

Cursor users gain a new Composer version positioned for longer-running coding work and more complex workflows.
Cursor further ties the Composer product line to Kimi K2.5, making the quality and availability of that underlying model consequential to Composer’s differentiation.

Second-order effects

Cursor must substantiate reliability gains in real development workflows, particularly given related criticism around AI-assisted coding reliability and its own warning about fragile “vibe coding” foundations.
Rival coding-agent providers face more pressure to compete on dependable multi-step execution, not only benchmark-level coding ability or raw model speed.

Third-order effects

If sustained-task reliability improves across coding agents, the competitive boundary may shift from single-model capability toward the surrounding agent system: task management, verification, human review, and integration into development workflows.
The episode also reinforces a supply-chain dynamic in AI products: application vendors can differentiate on product behavior while depending on upstream model providers and inference partners, increasing the importance of clear attribution and resilient access.

The trend: AI coding tools are moving from code-generation assistants toward agents judged by whether they can complete and safely sustain multi-step work over long-lived software projects.

Discussion

@cursor_ai @cursor_ai on x
Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we're doubling the included usage of the model. [image]
@cursor_ai @cursor_ai on x
Together with SpaceXAI, we're training a significantly larger model from scratch, using 10x more total compute. With Colossus 2's million H100-equivalents and our combined data and training techniques, we expect this to be a major leap in model capability.
@mil000 Milo Smith on x
Introducing [rebranded Chinese model] Its more intelligent (no shit) We are doubling usage forever. Hell, next week we might even triple it.
@evisdrenova Evis Drenova on x
The cursor fall-off is going to be studied for decades. I don't know any engineer who uses them anymore. Not to say that others don't, but it's obvious that they're no longer on the tech frontier. Still, a $60b outcome in 4 years is nothing to sneeze at... [image]
@cursor_ai @cursor_ai on x
We improved Composer by scaling training, generating more complex RL environments, and introducing new learning methods. For example, we use text feedback during RL to learn faster by assigning credit in rollouts spanning hundreds of thousands of tokens.
@grad62304977 @grad62304977 on x
Cool to see work on this although I think many misunderstood the main work here Making credit assignment work is mainly held back by the fact that in the objective to solve a task, it's rare to have intermediate parts of a rollout that are verifiable directly There is no correct
@kimmonismus @kimmonismus on x
Huge, did NOT expect that release. Evals looks very solid, significant jump compared to composer 2! But: it's 10x more efficient than the competition. Looks really exciting. Need to try it out [image]
@danperks Dan Perks on x
the team did an internal test of this model last week the whole company (bar a few exceptions) had all their cursor chats redirected to composer 2.5 for like 2 days. i didn't even notice, which I think is testament to the progress of this model. go use it, its very good.
@teknium @teknium on x
@elonmusk Is it accessible through the SuperGrok subscription too?
@togethercompute @togethercompute on x
Congrats to the @cursor_ai team on Composer 2.5 — a huge milestone for agentic coding models. Together AI, the AI Native Cloud, is proud to partner on this launch. Composer 2.5 is pushing the frontier for coding agents and turning heads for its speed and quality. Excited to
@frontier_foid @frontier_foid on x
its worth comparing Kimi K2.5 -> Kimi K2.6 and Kimi K2.5 -> Composer 2.5. can't tell on cursorbench but looks like a modest improvement over Kimi's posttraining effort on code, albeit a massive amount more compute. [image]
@benln Ben Lang on x
Composer 2.5 is now live For the next week, we're doubling the included usage
@eliebakouch Elie on x
cursor is at frontier scale, both in terms of performance and compute if composer 2.5's budget was put into a pre-train: ~6.3T total, 200B active trained on ~56T tokens if composer 3 allocates 50% of the budget to pre-training: ~500B active, 15.3T total trained on 135T tokens. [i…
@fireworksai_hq @fireworksai_hq on x
The @cursor_ai team shipped Composer 2 and now Composer 2.5 on the same Kimi K2.5 base model. Performance benchmarks are📈. Frontier quality and open-source economics. 85% of the compute powering these gains came from RL. Fireworks powers the RL rollouts. Learn more about [image]
@mweinbach Max Weinbach on x
Composer 2.5 is very good It's good at doing more than just quick iterations of front-end now I will probably use it over Claude in Cursor
@sualehasif996 Sualeh Asif on x
We've gotten really really good at RL. Composer 2.5 is fighting well-above its weight class. Very excited for the next release as we scale model sizes and FLOPs with @SpaceXAI!
@elonmusk Elon Musk on x
Try it out! (Partially trained on Colossus 2)
@michaelnicollsx Michael Nicolls on x
Congrats to the @cursor_ai team on the release of C2.5!
@kvfrans Kevin Frans on x
We used a pretty cool “RL with text feedback” formulation to train this one (see blog post for some details). As RL tasks get longer in horizon, I think it's a ripe time to think about ways we can extract signals that avoid the variance explosion.
@cursor_ai @cursor_ai on x
Composer 2.5 is exceptionally intelligent and up to 10x more efficient than similarly capable models. [image]
@scaling01 @scaling01 on x
yeah that's pretty good xAI might be able to cook with Cursor data + 10T model [image]
@clementdelangue Clem on x
Very cool to see Cursor doubling down on training great models. In my opinion, ultimately all serious companies in AI will want to train models themselves, based on open-source instead of outsourcing AI to others via APIs!
@muennighoff Niklas Muennighoff on x
Composer 2.5 sits on the Pareto frontier [image]
@mntruell Michael Truell on x
Composer 2.5 is a significant step up from Composer 2. This is the very start of our work with SpaceXAI. Hope to have more improvements out soon.
@sjwhitmore Sam Whitmore on x
composer 2.5 is really really great. I had it on last week for some testing, forgot that it was on, & totally didn't realize I wasn't on gpt 5.5 (my usual) for a while. the team did a fantastic job!!
@cursor_ai @cursor_ai on x
Composer 2.5 is built on the same open-source base as Composer 2, Moonshot's Kimi K2.5. [image]
@emostaque Emad on x
This is such an interesting chart layout, like it a lot! Congrats to @cursor_ai team on the 2.5 launch 🚀
@eliebakouch Elie on x
when you do continual pre training at this scale on traces that look like RL rollouts, does it hurt RL if the mid training data is very similar to what you RL on? what if it's the same data but with different rollouts from another model? intuitively i'd say yes, same intuition [i…

Chronicles