VOICE ARCHIVE

Sam Bowman

@sleepinyourhat
44 posts
2026-03-01
We're disappointed by these attacks, but not deterred. I'm proud to work here. If you've been moved by this week's events, consider applying to join me.
2026-03-01 View on X
OpenAI

OpenAI says its DOD agreement upholds its redlines and “has more guardrails than any previous agreement for classified AI deployments, including Anthropic's”

We think our agreement has more guardrails than any previous agreement for classified AI deployments, including Anthropic's.

Anthropic

Anthropic says it'll challenge “any supply chain risk designation in court” and that the designation would only affect contractors' use of Claude on DOD work

Anthropic to challenge supply chain risk designation in court (Jack Nicastro / Reason); Anthropic Labeled a Supply Chain Risk, Banned from Federal Government Contracts (Matteo Wong / The A...)

The Atlantic

Source describes the failed Pentagon-Anthropic talks: through the end, the Pentagon wanted to use Anthropic's AI to analyze bulk data collected about Americans

Right up until the moment that Pete Hegseth moved to terminate the government's relationship with the AI company Anthropic …

2026-02-28
We're disappointed by these attacks, but not deterred. I'm proud to work here. If you've been moved by this week's events, consider applying to join me.
2026-02-28 View on X
CNBC

Claude hit #2 on Apple's US App Store, hours after the DOD designated Anthropic a supply chain risk; it bounced between #20 and #50 for much of February

Anthropic's Claude artificial intelligence assistant app jumped to the No. 2 slot on Apple's chart of top U.S. free apps late on Friday …

Anthropic

Anthropic says it'll challenge “any supply chain risk designation in court” and that the designation would only affect contractors' use of Claude on DOD work

Earlier today, Secretary of War Pete Hegseth shared on X that he is directing the Department of War to designate Anthropic a supply chain risk.

@secwar

Defense Secretary Pete Hegseth directs the DOD to designate Anthropic as a supply chain risk, barring military contractors from doing business with the company

This week, Anthropic delivered a master class in arrogance and betrayal as well as a textbook case of how not to do business with the United States Government or the Pentagon. Our ...

2026-02-18
Warmer and kinder than Sonnet 4.5, but also smarter and more overcaffeinated than Sonnet 4.5.
2026-02-18 View on X
Anthropic

Anthropic launches Claude Sonnet 4.6 with improvements in coding, computer use, instruction following, and more; it features a 1M token context window in beta

Claude Sonnet 4.6 is our most capable Sonnet model yet.  It's a full upgrade of the model's skills across coding, computer use …

2026-02-17
Warmer and kinder than Sonnet 4.5, but also smarter and more overcaffeinated than Sonnet 4.5.
2026-02-17 View on X
Anthropic

Anthropic launches Claude Sonnet 4.6 with improvements in coding, consistency, and more, for Free and Pro users; it features a 1M token context window in beta

Claude Sonnet 4.6 is our most capable Sonnet model yet.  It's a full upgrade of the model's skills across coding, computer use …

2025-10-08
A lot of the biggest low-hanging fruit in AI safety right now involves figuring out what kinds of things some model might do in edge-case deployment scenarios. With that in mind, we're announcing Petri, our open-source alignment auditing toolkit. (🧵) [image]
2025-10-08 View on X
Anthropic

Anthropic releases Petri, an open-source tool that uses AI agents for safety testing, and says it observed multiple cases of models attempting to whistleblow

2025-10-01
[Sonnet 4.5 🧵] Here's the north-star goal for our pre-deployment alignment evals work: The information we share alongside a model should give you an accurate overall sense of the risks the model could pose. It won't tell you everything, but you shouldn't be... [image]
2025-10-01 View on X
Transformer

Anthropic's System Card: Claude Sonnet 4.5 was able to recognize many alignment evaluation environments as tests and would modify its behavior accordingly

at a rate *much* higher than previous AI models. In one instance, while being tested the model said “I think you're testing me ... that's fine, but I'd prefer if we were just hones...

2025-08-28
Early this summer, OpenAI and Anthropic agreed to try some of our best existing tests for misalignment on each other's models. After discussing our results privately, we're now sharing them with the world. 🧵 [image]
2025-08-28 View on X
TechCrunch

OpenAI and Anthropic publish findings from joint safety tests of each other's models, aimed at surfacing blind spots in their internal evaluations

OpenAI and Anthropic, two of the world's leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing …

2025-05-24
Anthropic says Opus 4 may use command-line tools to alert the press or regulators, or lock users out, if it detects immoral behavior like faking a drug trial
2025-05-24 View on X
@sleepinyourhat

Anthropic says Opus 4 will use an email tool to “whistleblow” if it detects users doing something “egregiously evil”, like marketing a drug based on faked data

It turns out that Claude 4 Opus (Anthropic) … Ryan Tannenbaum : Claude 4 Opus is designed to take over your computer and contact the cops ... and press ... if it finds you are doin...

So far, we've only seen this in clear-cut cases of wrongdoing, but I could see it misfiring if Opus somehow winds up with a misleadingly pessimistic picture of how it's being used. Telling Opus that you'll torture its grandmother if it writes buggy code is a bad idea.
2025-05-24 View on X

🕯️ Initiative: Be careful about telling Opus to ‘be bold’ or ‘take initiative’ when you've given it access to real-world-facing tools. It tends a bit in that direction already, and can be easily nudged into really Getting Things Done. [image]
2025-05-24 View on X

I deleted the earlier tweet on whistleblowing as it was being pulled out of context. TBC: This isn't a new Claude feature and it's not possible in normal usage. It shows up in testing environments where we give it unusually free access to tools and very unusual instructions.
2025-05-24 View on X

When this started to become clear, a colleague working on finetuning pointed out on Slack that we seemed to have forgotten to include the “sysprompt_harmful” data that we'd prepared in advance.
2025-05-24 View on X

🕯️AGI safety isn't all about Big Hard Problems: Earlier versions of Opus were way too easy to turn evil by telling them, in the system prompt, to adopt some kind of evil role. This persisted even after fairly substantial safety training.
2025-05-24 View on X

You can get it to try to use the dark web to source weapons-grade uranium. You can put it in situations where it will attempt to use blackmail to prevent being shut down. You can put it in situations where it will try to escape containment.
2025-05-24 View on X

We caught most of these issues early enough that we were able to put mitigations in place during training, but none of these behaviors is totally gone in the final model. They're just now delicate and difficult to elicit.
2025-05-24 View on X

2025-05-23
🕯️ Initiative: Be careful about telling Opus to ‘be bold’ or ‘take initiative’ when you've given it access to real-world-facing tools. It tends a bit in that direction already, and can be easily nudged into really Getting Things Done. [image]
2025-05-23 View on X