Users accuse Anthropic of degrading Claude Opus 4.6's and Claude Code's performance; the startup's employees publicly deny it degrades models to manage capacity
A growing number of developers and AI power users are taking to social media to accuse Anthropic of degrading the performance …
VentureBeat · Carl Franzen
Related Coverage
- Claude Code cache chaos creates quota complaints The Register · Tim Anderson
- AMD's senior director of AI thinks ‘Claude has regressed’ and that it ‘cannot be trusted to perform complex engineering’ PC Gamer · James Bentley
- One of Silicon Valley's Hottest Companies Is Facing a Revolt—From Its Own Fans Slate · Alex Kirshner
- Anthropic Ships Claude Code Routines, Cloud Automations That Run Without Your Mac Implicator.ai · Maria Garcia
- ‘Claude cannot be trusted to perform complex engineering tasks’: AMD AI head slams Anthropic's coding tool after months of frustration TechRadar · Craig Hale
- Anthropic is facing a wave of user backlash over reports of performance issues with its Claude AI chatbot Fortune · Beatrice Nolan
- Users Say Anthropic's Claude Is Getting Worse. A Quiet Change May Be to Blame Inc.com · Leila Sheridan
Discussion
- Boris Cherny (@bcherny) on X:
@tengyanAI This is false. We defaulted to medium as a result of user feedback about Claude using too many tokens. When we made the change, we (1) included it in the changelog and (2) showed a dialog when you opened Claude Code so you could choose to opt out. Literally nothing sne…
- @trq212 on X:
@Hesamation we don't degrade our models to better serve demand, have said this many times before
- Peter Yang (@petergyang) on X:
My entire feed and the Claude subreddit is full of ppl saying opus got nerfed. Why would Anthropic nerf its own models?
- @hesamation on X:
AMD Senior AI Director confirms Claude has been nerfed. She analyzed Claude's session logs from January to March: > median thinking dropped from ~2,200 to ~600 chars > API requests went up 80x from Feb to Mar. less thinking and failed attempts meaning more retries, burning more …
- Om Patel (@om_patel5) on X:
SOMEONE ACTUALLY MEASURED HOW MUCH DUMBER CLAUDE GOT. THE ANSWER IS 67%. the data shows Opus 4.6 is thinking 67% less than it used to. anthropic said nothing until the numbers went public. then suddenly Boris Cherny (creator of Claude Code) shows up on the GitHub issue. users [im…
- Paul Calcraft (@paul_cal) on X:
Despicable clout chasing. They tested Opus today on 30 tasks; the previous Opus 4.6 score was on just *6* tasks. DIFFERENT BENCHMARK. 6 tasks in common, results: 85.4% score today vs. 87.6% prev. Swing is mostly from a *single* fabrication without repeats - easily statistical noise [im…
- Danielle Fong (@daniellefong) on X:
@petergyang additional guardrails => central anxiety vector => they don't really feel it because ANT = 1 with rich profiles.
- Dev Shah (@0xdevshah) on X:
you can run the nerfing play once, maybe twice. but anthropic will silently degrade production models to farm failure data every time. and this is where it stops being a clever strategy and starts being a trust problem.
- Dylan Field (@zoink) on X:
[video]
- Marcos Pereira (@marcospereeira) on X:
I think the anthropic people are gaslighting us, sidestepping questions and answering a different question in lawyer speak. brand trust erosion speedrun any%
- Ben Bajarin (@benbajarin) on X:
Not enough compute is the correct take IMO. Which has quite a lot more implications if you think about it and play that out to its logical conclusion. Software still burdened by hardware's inability to keep up. Maybe software is 2-3 years ahead of hardware?
- Max Weinbach (@mweinbach) on X:
@edzitron Sometimes they make tweaks to Claude Code's harness to try to improve something, and there's a weird byproduct of it reducing overall quality of output and burning tokens It happens, but mostly with Claude over the other models. Claude seems more sensitive to stuff like…
- @hesamation on X:
it's also fair to include @bcherny's reply under this issue [image]
- Ed Zitron (@edzitron) on X:
I have seen scattered reports of Claude burning more tokens, and it does seem like token burn increased on openrouter in this period too, wonder if 4.6 is also part of it?
- Joe Weisenthal (@thestalwart) on X:
So how much of the limited (virtually non-existent) Mythos rollout is a function of Anthropic's compute scarcity?
- Erik Voorhees (@erikvoorhees) on X:
Claude being nerf'd and agents being exiled from the $200/mo plan are very consistent behavior if Anthropic is going public soon and will have finances/margin intensely scrutinized...
- @grummz on X:
This is not enough. Claude and Anthropic need to explain all the measured increases in hallucination and degradation as observed by AMD. You can't just handwave this away.
- Patrick Moorhead (@patrickmoorhead) on X:
@petergyang They've run out of compute. CLI isn't nerfed, everything else is.
- @trq212 on X:
@Hesamation boris responded to this in depth in the issue- it's mostly just that we stopped showing thinking summaries for latency (you can opt-in to showing it) which was affecting the thinking measurement in the post https://github.com/...
- Teng Yan (@tengyanai) on X:
basically: anthropic sneakily turned down how hard claude thinks before editing code, changed the default from “high” to “medium” effort, and hid the reasoning from session logs. all without telling users. an amd director had 7k sessions of telemetry to prove the degradation
- Dare Obasanjo (@carnage4life) on Bluesky:
Imagine being one of those CEOs who laid off thousands over “AI efficiency” only for the AI to get dumber than a pile of bricks weeks later. — Given how tokens work, you're paying more for a worse product. It now takes 5 minutes to be wrong which took 30 seconds for a right an…
- r/ClaudeAI on Reddit:
Claude Performance and Bugs Megathread Ongoing (Sort this by New!)
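Paul Calcraft's small-sample objection earlier in the thread can be sanity-checked with back-of-envelope arithmetic. A minimal sketch using the binomial standard error of a pass rate, with the task counts and scores taken from his post (the benchmark's exact scoring method is not public, so this is an illustration of the noise argument, not a reproduction of the evaluation):

```python
import math

def pass_rate_stderr(p: float, n: int) -> float:
    """Binomial standard error of an observed pass rate p over n independent tasks."""
    return math.sqrt(p * (1 - p) / n)

# Scores quoted in the post: 87.6% previously vs. 85.4% today, a 2.2-point swing.
swing = 0.876 - 0.854

# On the 6 overlapping tasks, a single flipped result alone moves the score
# by ~16.7 points, and the standard error is roughly +/-13.5 points --
# far larger than the observed swing. Even at n=30 it is ~+/-6 points.
print(f"stderr, n=6:  {pass_rate_stderr(0.876, 6):.3f}")
print(f"stderr, n=30: {pass_rate_stderr(0.876, 30):.3f}")
print(f"observed swing: {swing:.3f}")
```

Under these assumptions the 2.2-point difference sits well inside one standard error of a 6-task sample, which is the substance of the "easily statistical noise" claim.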