Anthropic rolls out a fast mode for Claude Opus 4.6 in research preview, saying it offers the same model quality 2.5 times faster but costs six times more

Opus is usually $5/million input and $25/million output. The new fast mode is $30/million input and $150/million output!

Simon Willison's Weblog 2026-02-08 Simon Willison

Context & Ripple Effects

Anthropic had already positioned Opus as its premium model line, and in an earlier pricing reset it cut Opus 4.5 API pricing to the same $5/$25 input/output baseline used here. A separate report on Opus 4.6 emphasized deeper focus on difficult tasks, which leaves latency as a distinct dimension on which to sell the same model capability.

First-order effects

Developers with time-sensitive workloads can pay for a faster Opus 4.6 path without changing models, while standard-mode users retain the lower-priced option.
Anthropic turns delivery speed into an explicit premium SKU: the claimed 2.5x speedup carries a sixfold increase in token prices.
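
To make the sixfold gap concrete, here is a minimal Python sketch that prices one request in each mode using the per-million-token rates quoted above. The token counts are hypothetical and chosen only for illustration; the function and dictionary names are not from any Anthropic SDK.

```python
# Per-million-token prices in USD, as quoted in the summary above.
PRICES = {
    "standard": {"input": 5.00, "output": 25.00},
    "fast": {"input": 30.00, "output": 150.00},
}


def request_cost(mode: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request in the given mode."""
    p = PRICES[mode]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
standard = request_cost("standard", 20_000, 2_000)
fast = request_cost("fast", 20_000, 2_000)
print(f"standard: ${standard:.2f}, fast: ${fast:.2f}, ratio: {fast / standard:.1f}x")
# prints "standard: $0.15, fast: $0.90, ratio: 6.0x"
```

Because input and output prices both rise by the same factor, the ratio stays at 6x regardless of the input/output mix.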

Second-order effects

Customers will have to evaluate speed against total task economics, likely reserving the fast tier for workflows where waiting time is more costly than additional API spend.
Other frontier-model providers face a clearer market test for latency-priced tiers, rather than competing only on model quality and per-token rates.
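
One way to frame that trade-off is a break-even value of waiting time: the hourly rate at which the time saved pays for the extra API spend. The Python sketch below is a simplification under stated assumptions. It assumes the whole task duration shrinks by the claimed 2.5x (real tasks also include input processing, tool calls, and human review that may not speed up), and the example cost and duration are hypothetical.

```python
def break_even_hourly_value(
    standard_cost_usd: float,
    standard_seconds: float,
    price_multiplier: float = 6.0,  # fast-mode price relative to standard
    speedup: float = 2.5,           # claimed speedup; assumed to apply to the whole task
) -> float:
    """Return the USD-per-hour value of waiting time at which fast mode breaks even."""
    extra_cost = standard_cost_usd * (price_multiplier - 1.0)
    seconds_saved = standard_seconds * (1.0 - 1.0 / speedup)
    return extra_cost / (seconds_saved / 3600.0)


# Hypothetical task: $0.15 and 60 seconds in standard mode.
# Fast mode costs $0.75 more and saves 36 seconds.
value = break_even_hourly_value(standard_cost_usd=0.15, standard_seconds=60.0)
print(f"break-even value of waiting time: ${value:.0f}/hour")
```

Under these assumptions the hypothetical task breaks even at about $75 per hour of waiting time. Tasks that cost more per second of generation push the threshold higher, and the reverse holds for cheaper ones.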

Third-order effects

If customers adopt such tiers, frontier-model APIs may increasingly segment by service level—speed, reliability, and quality—rather than treating each model as a single commodity offering.
The durable buying metric shifts toward cost per completed useful task: a higher token price can be rational only where lower latency materially improves the surrounding workflow.

The trend: Frontier AI providers are moving toward capacity-aware, multi-tier inference pricing that monetizes low-latency access separately from model intelligence.

Discussion

@claudeai Claude on x
Our teams have been building with a 2.5x-faster version of Claude Opus 4.6. We're now making it available as an early experiment via Claude Code and our API.
@udiwertheimer Udi Wertheimer on x
i much prefer codex over claude code as a coding agent BUT opus 4.6 is so good for just chatting. way better than the others. smarter, more creative, feel more natural. has really good ideas and analysis
@edzitron Ed Zitron on x
Gotta wonder if the plan isn't for this to eventually be considered the normal speed and also the normal price (6x)
@zoink Dylan Field on x
Opus 4.6 (fast mode) is... really fast! I was very impressed by the speed and quality. Reply to the Figma post below with something awesome you've created in Figma Make and we'll DM you for special access this weekend!
@claudeai Claude on x
@cursor_ai ... We plan to expand API access to more customers. You can join the waitlist here: https://claude.com/...
@adocomplete Ado on x
Calling all builders! Opus 4.6 is here. We want to see what you can create with it. → $100k prize pool → $500 in API credits to build → Hack Feb 10 - Feb 16, party in SF to celebrate Feb 21 I'm honored to be one of the judges and can't wait to see what you'll build. [video]
@github @github on x
We've been testing this all morning and we're loving the speed ⚡️ Copilot Pro+ users now have access to fast mode for Claude Opus 4.6 in research preview. Let us know what you think 👀
@lovable @lovable on x
Lovable now supports Claude Opus 4.6 with fast mode (research preview) for select tasks. 2.5x faster code generation with the same Opus-level intelligence.
@_catwu Cat on x
We granted all current Claude Pro and Max users $50 in free extra usage. This credit can be used on fast mode for Opus 4.6 in Claude Code. To use, claim the credit and toggle on extra usage on https://claude.ai/.... Then, run ‘claude update && claude’ and ‘/fast’. Enjoy!
@dylan522p Dylan Patel on x
Open models show 2.5x faster, 6x more expensive Lower batch size, speculative decoding harder Pareto optimal curve for Deepseek at https://inferencemax.ai/ shows this Claude Opus 4.6 is 100 Tok/s/user Deepseek at 100 is 6k Tok/s/GPU At 250 tok/s/user it's closer to 1k [image]
@windsurf @windsurf on x
Opus 4.6 (fast mode) is now available in Windsurf in research preview! It's just as smart as regular Opus 4.6 but runs up to 2.5x faster. Users will have access to promo pricing until Feb 16. Let us know what you think. [image]
@cerebras @cerebras on x
Fast inference = 6x markup? Don't be giving us any ideas 😼
@bcherny Boris Cherny on x
We just launched an experimental new fast mode for Opus 4.6. The team has been building with it for the last few weeks. It's been a huge unlock for me personally, especially when going back and forth with Claude on a tricky problem.
@jeffwsurf Jeff Wang on x
Opus 4.6 Fast is the fastest SOTA model we've tried in Windsurf yet. It's able to reason through complex problems at the same level as Opus 4.6 while achieving up to 2.5x faster output token speeds
@alexalbert__ Alex Albert on x
This has been one of my biggest productivity boosts of the past year. Highly recommend trying this out, in some ways it feels just as impactful as a model intelligence upgrade.
@yuchenj_uw Yuchen Jin on x
2.5x faster but 6x more expensive. This can't be achieved by inference optimization, must be new chips. TPU? B200? AWS Inferentia? Cerebras?
@cursor_ai @cursor_ai on x
It's priced at $30 input / $150 output tokens. For the next 10 days, it's available for 50% off.
@claudeai Claude on x
Fast mode is available now for Claude Code users with extra usage enabled (use /fast). It's also available in research preview on @cursor_ai, @emergentlabs, @FactoryAI, @figma, @github Copilot, @Lovable, @v0, and @windsurf.
@zephyr_z9 @zephyr_z9 on x
2.5x speedup and 6x price increase Small batch size Total throughput of the system takes a giant hit on increasing tokens/user/sec
@cursor_ai @cursor_ai on x
Opus 4.6 (fast mode) is now available in Cursor! It's 2.5x as fast in research preview.
@github @github on x
🏎️ Fast mode for @AnthropicAI's Claude Opus 4.6 is rolling out in research preview on GitHub Copilot. Get 2.5x faster token speeds with the same frontier intelligence—now at promotional price of 9 premium requests through Feb 16. This release is early and experimental. Try it
@deanwball Dean W. Ball on x
I would like to know more about the experimental Claude scaffold that caused Opus 4.6 to more than double its performance in optimizing GPU kernels over the standard scaffold [image]
@deanwball Dean W. Ball on x
This is true of the Claude app as well. All other aspects of my machine function normally; Claude Code is a resource hog but not machine-stoppingly so. there is something wrong with these GUI apps and it seems not insane to wonder if the vibe-coding helps explain it.
@mikeyk Mike Krieger on x
I've spent all my time after switching to Labs building with fast Opus and it's a crazy unlock — excited that we're making it available outside Anthropic too.
@bcherny Boris Cherny on x
Use /fast to enable. It uses a lot more compute than Opus 4.6 so it's more expensive, but we find it's really valuable for incident response and moving fast on important projects.
@figma @figma on x
Anthropic's research preview for Claude Opus 4.6 (Fast mode) is here, and it's in Figma Make for a limited time (for free and 2.5x the speed) Show us what you've made in Make and we'll DM you for special access through the weekend [image]
r/ClaudeAI r on reddit
Opus 4.6: Fast-Mode
@prietschka Paul Rietschka on bluesky
I don't believe a word of this. Unless/until Anthropic allows a set of reputable forensic accountants in to look at the books I'll treat everything Wario et sa soeur say as lies. [embedded post]
@buccocapital @buccocapital on x
Ominous words from the legendary @Steve_Yegge Well worth the full read [image]
@yacinemtb Kache on x
@teortaxesTex Seems like armodei figured out how to motivate people who retire after one paycheck to keep on working. Jihad
@buccocapital @buccocapital on x
For 50 yrs we treated the supremacy of asset-light businesses as a permanent economic law But if AI commoditizes asset-light businesses, we'd just be reverting to the historical mean where value accrued to atoms, infrastructure, energy It would be a 50 year blip. An anomaly
@teortaxestex @teortaxestex on x
me: Claude is a labradoodle AI, overhyped meanwhile Anthropic: [image]
@druce.ai @druce.ai on bluesky
Anthropic operates a high-velocity “hive mind” with ≈90-day planning cycles, versioned workstreams, and claimed 10×-1000× developer productivity, enabling rapid productization such as Claude Cowork, launched publicly about 10 days after conception.
@johnspurlock.com John Spurlock on bluesky
‘So now you see how the magic starts and ends. During Golden Ages, there is more work than people. And when they crash, it is because there are more people than work.’ — steve-yegge.medium.com/the-anthropi...
@carnage4life Dare Obasanjo on bluesky
Steve Yegge writes about Anthropic's culture. Nothing concrete, mostly vibes. — He says it has that early Google/Amazon lightning in a bottle energy. — He notes Google lost that energy when Larry Page pivoted to focus on profits, killed 20% time and suddenly had too many peo…
@alexpalcuie @alexpalcuie on x
launched opus 4.6 on thursday, shipping fast mode on saturday we called it fast because that's how long it took to ship after the last one

Chronicles