Anthropic updates its Responsible Scaling Policy, setting benchmarks for when an AI model's abilities reach a point where additional safeguards are necessary
Anthropic, the artificial intelligence company behind the popular Claude chatbot, today announced a sweeping update …
VentureBeat · Michael Nuñez
Related Coverage
- Announcing our updated Responsible Scaling Policy Anthropic
- Anthropic Updates its Responsible Scaling Policy to Counter AI Risks WinBuzzer · Markus Kasanmascheff
- Anthropic updates policy to address AI risks Silicon Republic · Suhasini Srinivasaragavan
- Anthropic makes an AI safety plan Platformer · Casey Newton
- Announcing Our Updated Responsible Scaling Policy Hacker News
Discussion
- Casey Newton (@crumbler) on Threads: I wrote about Anthropic's new and improved plan to stop AI from doing harm and wondered what might happen if other companies took a similarly proactive approach to identifying risks https://www.platformer.news/ ...
- Charles Foster (@cfgeek) on X: Fun fact, while Anthropic's “AI Safety Levels (ASLs)” are inspired by the 4 biosafety levels, there is no limit or inherent meaning to ASLs. Unlike BSL-4, “ASL-4” isn't necessarily the top level or even particularly strict safety/security, it just means “the level above ASL-3”.
- Dave Kasten (@david_kasten) on X: I'm still reading this, and more broadly am still somewhat uncertain whether I think RSPs are actually conceptually feasible at higher levels of intelligence, but I do like that they are explicitly logging even minor deviations like “our eval took 3 days longer than the policy”
- @kimmonismus on X: Instead of Opus 3.5 we get a safety post. Have they learned from OpenAI? Come on guys, you know what we want.
- Michael Cohen (@michael05156007) on X: Looks like ASL-4 measures are still a to-do. My hypothesis is they can't come up with any satisfactory measures without fundamentally reworking their approach. Every year that they fail to even write down a scheme for handling ASL-4 responsibly, this hypothesis looms larger.
- Adi (@adonis_singh) on X: most people shit on anthropic, but this makes me even more excited for opus 3.5 it means it's going to be a smart model, so smart in fact, they had to do this..
- Jason D. Clinton (@jasondclinton) on X: I'm excited to share something that we've been working on for the past few months. Big updates to the RSP and more clarity on the security controls in: https://www.anthropic.com/... . We think of the RSP as a prototype for regulation and it holds ourselves accountable to the publ…
- r/ClaudeAI on Reddit: Anthropic Announces updated Responsible Scaling Policy