2. That's how many AI models can now plan and execute a multi-step cyberattack. The White House blocked commercial access to the first one today. The second was announced the same day.

The First

On April 8, Anthropic released Mythos Preview to roughly 40 organizations — critical software operators, a handful of security firms, government contractors selected for Project Glasswing. The company explicitly declined to make it generally available. The UK AI Security Institute had evaluated it and reached a specific conclusion: Mythos Preview was the first AI model to solve a multi-step cyberattack simulation. Not a demonstration. Not a benchmark approximation. A simulation of the kind of sequenced, adaptive exploit chain that security researchers had previously considered beyond current AI capability.

The reaction was immediate. The New York Times called Anthropic's restraint "a terrifying warning sign" and reported that tech executives had privately briefed Trump administration officials on the national security implications. The model was restricted. Access was tightly controlled. The number stayed at 40.

The other number didn't stay at 1.

The Restriction

Anthropic proposed expanding Mythos to approximately 70 additional organizations. The Wall Street Journal reported today that the White House opposes the plan on security grounds. Zero of the 70 were approved.

The same day, the AI Security Institute published its evaluation of OpenAI's GPT-5.5. The finding: GPT-5.5 "reaches a similar level of performance as Mythos Preview and is the second model to solve a multi-step cyberattack simulation."

The numbers as of today: 40 organizations with approved Mythos access. 70 more, blocked. 2 models at the cyberattack threshold.

The Containment Window

The time between Mythos Preview's release and GPT-5.5's matching evaluation is approximately three weeks. This is not the time it took OpenAI to copy Anthropic — both models were developed independently, on parallel timelines. It is the time between one frontier lab reaching a capability threshold and the next lab reaching the same one.

The policy implication is structural. A containment approach that restricts one model's distribution operates on the assumption that the underlying capability is singular — that there is one dangerous model, and if you control access to it, you control the risk. But frontier AI capabilities don't work that way. The capability isn't in the model; it's in the training approach, the scale, the data. When one lab demonstrates a capability at frontier scale, the others have typically been building toward the same thing.

This is not a new observation. It is usually made in the abstract. Today it arrived as a data point: the White House enacted its first major AI model containment action on the same day it became demonstrably insufficient.

The Access Map

The access picture is already fragmented. The NSA is using Mythos Preview — despite the Pentagon's February designation of Anthropic as a supply-chain risk. CISA, the government's primary cybersecurity agency, does not have approved access. Unauthorized users on Discord have had access since April 8, the day the model was announced. The restriction produced a specific outcome: the agency responsible for civilian cyber defense was excluded; the intelligence community moved in without official sanction; commercial users were blocked; a private Discord channel was not.

OpenAI is restricting its GPT-5.5 security variant to "critical cyber defenders" — an approach similar to Anthropic's. Two companies, two models, the same capability, the same restricted access posture. The number of organizations that can officially access either: approximately 40. The number of models at the threshold: 2, and counting.

The Baseline

In April 2026, there was one AI model capable of solving a multi-step cyberattack simulation. The government response was to restrict its expansion. The baseline was 1.

By April 30, there were 2. The containment policy hadn't changed. The baseline had.

2 is not a large number. It is larger than 1. The distance between them — three weeks, one model generation, one evaluation cycle — is the containment window. It is the time available to build a policy calibrated to a specific capability before a second model independently validates that the capability is no longer singular. Three weeks is a short window.