Anthropic says Mythos Preview achieves 93.9% on SWE-bench Verified, compared with 80.8% for Opus 4.6, and 77.8% on SWE-bench Pro, versus 53.4% for Opus 4.6

Michael Nuñez / VentureBeat

VentureBeat 2026-04-07 Michael Nuñez

Discussion

@apompliano Anthony Pompliano on x
AI is coming for a lot of jobs. Just look at these performance metrics from Anthropic's latest model. Superhuman intelligence is going to be available to anyone. [image]
@deedydas Deedy on x
Claude Mythos just obliterated every single benchmark in AI. I can't believe what I'm reading. [image]
@fabknowledge @fabknowledge on x
wow this is the biggest step change in a new model release in recent memory [image]
@fabknowledge @fabknowledge on x
Mythos able to exploit like firefox pretty easily. Cybench is 100% at 1 pass which is lol [image]
@neilhtennek Kenneth on x
I cannot celebrate Mythos, it brings a sense of dread I do not particularly understand. 93.9% SWE-Bench. [image]
@kimmonismus @kimmonismus on x
MYTHOS BENCHMARKS, OFFICIAL. HOLY MOLY Anthropic cooked!! [image]
@yuchenj_uw Yuchen Jin on x
After seeing the Mythos benchmark scores, my Claude Opus 4.6 already feels outdated. Anthropic, can you just drop Mythos? I know you can't do it due to some “safety” reasons, but I'd happily pay $2,000/month to use it. AGI is already here - it's just not evenly distributed.
@yuchenj_uw Yuchen Jin on x
Anthropic is truly unstoppable. Mythos is crushing Claude Opus 4.6 across every serious agentic coding benchmark. It has found vulnerabilities in the Linux kernel, a 27-year-old vulnerability in OpenBSD, and a 16-year-old vulnerability in FFmpeg. No wonder folks at big labs [imag…
r/technology on reddit
Anthropic says its most powerful AI cyber model is too dangerous to release publicly — so it built Project Glasswing
