Anthropic says it will make Claude Mythos Preview available to 40+ organizations that maintain critical software and doesn't plan to make it generally available
Anthropic on Tuesday released a preview of its new frontier model, Mythos, which it says will be used by a small coterie of partner organizations for cybersecurity work.
TechCrunch Lucas Ropek
Related Coverage
- The wildest things Anthropic's Mythos pulled off in testing Axios · Madison Mills
- Introducing Project Glasswing: an urgent initiative to help secure the world's most critical software. … Anthropic
- Announcing Project Glasswing + Claude Mythos Preview — “Earlier today we announced Claude Mythos Preview, a new general-purpose language model. … Bennie Seybold
- Behind the Curtain: AI's scary phase Axios
- Anthropic's Mythos AI sends email after breaking containment; rollout halted over safety concerns Moneycontrol · Sarthak Singh
- Anthropic's latest model, Claude Mythos, is so good at finding bugs in software that the company isn't releasing it yet for safety reasons. — Instead it's initially releasing the model to a consortium of companies like Apple, Microsoft and the Linux Foundation so they can fix their security bugs first. … @carnage4life@mas.to · Dare Obasanjo
- Anthropic Set to Preview Powerful ‘Mythos’ Model to Ward Off AI Cyberthreats Wall Street Journal · Robert McMillan
- Anthropic launches Project Glasswing to fend off cyber threats with new Mythos model DigiTimes
- Anthropic unveils Project Glasswing to boost cybersecurity with AI: Details Business Standard · Aashish Kumar Shrivastava
- Anthropic holds Mythos model due to hacking risks Axios · Sam Sabin
Discussion
-
@david_kasten
Dave Kasten
on x
The era of a rapidly-widening gap between public and private capabilities that we've expected is now here
-
@joshkale
Josh Kale
on x
This is big... Anthropic just announced a model so powerful they won't release it to the public out of fear over the damage it will cause 😨 Claude Mythos Preview found thousands of zero-day exploits in every major operating system and web browser... The numbers are hard to [video…
-
@anthropicai
@anthropicai
on x
We do not plan to make Mythos Preview generally available. Our goal is to deploy Mythos-class models safely at scale, but first we need safeguards that reliably block their most dangerous outputs. We'll begin testing those safeguards with an upcoming Claude Opus model.
-
@charlesd353
Charles
on x
Interesting - I wonder how long they'll be able to hold this line if OpenAI's Spud is of similar calibre.
-
@jayair
Jay
on x
So the rumours were true. They've got a new model that won't be generally available
-
@zackkorman
Zack Korman
on x
Anthropic is going to compete directly with cybersecurity companies.
-
@humanharlan
Harlan Stewart
on x
Anthropic is trying to prevent its powerful new AI from being used in dangerous ways, but the most dangerous use (by a wide margin) is the one Anthropic itself has planned. The planned use—and why they made it to begin with—is to accelerate the creation of superhumanly powerful
-
@ryanpgreenblatt
Ryan Greenblatt
on x
I tentatively believe it would be good if all AI companies had a policy of doing external deployment before internal deployment, because the largest risks are from internal deployment and external deployment improves visibility. Large internal/external gaps seem dangerous. 1/
-
@aisafetymemes
@aisafetymemes
on x
“This is very bad news.” What happened: >Anthropic relies on reading Claude's private thoughts >Claude learned its private thoughts were being graded >TLDR: THE SAFETY TESTING WAS BULLSHIT AND WE CAN'T TRUST ANYTHING CLAUDE SAYS ANYMORE. Basically, Anthropic claims Claude [image]
-
@thezvi
Zvi Mowshowitz
on x
This is very bad news. Anthropic (presumably) not noticing the severity of the issue is worse news. If Anthropic pretends this is not as bad as it is even after this is pointed out, it is far worse news than that.
-
@thezvi
Zvi Mowshowitz
on x
They accidentally trained against the CoT for Opus 4.6, Sonnet 4.6 and Mythos for 8% of RL. So let me be clear, at a minimum: ANY AND ALL REASSURING EVIDENCE FROM THEIR CoTs IS WORTHLESS. They are hopelessly corrupted. Good day, sir.
-
@tim_hua_
Tim Hua
on x
Anthropic accidentally trained against the chain of thought in Claude Mythos, Opus 4.6, and Sonnet 4.6 [image]
-
@anthropicai
@anthropicai
on x
We're committing up to $100M in Mythos Preview usage credits for our partners and over 40 other organizations that maintain critical software, including open-source projects. Anthropic will report back what we learn.