Anthropic hires former OpenAI safety lead Jan Leike to head up a new Superalignment team; a source says Leike will report to Chief Science Officer Jared Kaplan
Wendy Lee / Los Angeles Times : OpenAI forms safety and security committee as concerns mount about AI Rounak Jain / Benzinga : OpenAI Former ‘Superalignment’ Lead Joins Jeff Bezos-...
Anthropic researchers: AI models can be trained to deceive, and the most commonly used AI safety techniques have little to no effect on the deceptive behaviors
Abraham Samma / @abesamma@toolsforthought.social : Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training — This is some sci-fi stuff right here (even if unsurprising)...
OpenAI says its board can hold back the release of an AI model even if OpenAI's leadership deems it safe, and announces a new internal safety advisory group
The study of frontier AI risks has fallen far short of what is possible and where we need to be. Ina Fried / Axios : OpenAI touts ‘scientific approach’ to measure catastrophic risk Matthias Bastian / ...
Framing AI debates as a schism between people worried about AI going rogue and those illuminating actual harms is ahistorical and obscures important research
In two recent conversations with very thoughtful journalists, I was asked about the apparent ‘schism’ between those making a lot … Bluesky: @abeba.bsky.social , @mmitchell.bsky.social , and @emilymben...
How Silicon Valley became obsessed with effective altruism — championed by SBF before he dismissed it as a dodge — and with doomsday scenarios like killer rogue AI
Sonia Joseph was 14 years old when she first read Harry Potter and the Methods of Rationality, a mega-popular piece of fan fiction … Tweets: @chafkin , @ellenhuet , @business , @can , @crypto , @sonia...