ai, research

responsible-scaling-policy-v3

Cui · Feb 26, 2026 · 4 mins read

Policy Announcements: Anthropic’s Responsible Scaling Policy, Version 3.0 (Feb 24, 2026)

Read the Responsible Scaling Policy

We’re releasing the third version of our Responsible Scaling Policy (RSP), the voluntary framework we use to mitigate catastrophic risks from AI systems. Anthropic has now had an RSP for more than two years, and we’ve learned a great deal about its benefits and its shortcomings. We’re therefore updating the policy to reinforce what has worked well to date, improve the policy where necessary, and implement new measures to increase the transparency and accountability of our decision-making. You can read the new RSP in full here. In this post, we’ll discuss some of the thinking behind the changes.

The original RSP and our theory of change

The RSP is our attempt to solve the problem of how to address AI risks that are not present at the time the policy is written, but which could emerge rapidly as a result of an exponentially advancing technology. When we wrote the original RSP in September 2023, large language models were essentially chat interfaces. Today they can browse the web, write and run code, use computers, and take autonomous, multi-step actions. As each of these new capabilities has emerged, so have new risks. We expect this pattern to continue.

We focused the RSP on the principle of conditional, or if-then, commitments. If a model exceeded certain capability levels (for example, biological science capabilities that could assist in the creation of dangerous weapons), then the policy stated that we should introduce a new and stricter set of safeguards (for example, against model misuse and the theft of model weights). Each set of safeguards corresponded to an “AI Safety Level” (ASL): for example, ASL-2 referred to one set of required safeguards, whereas ASL-3 referred to a more stringent set of safeguards needed for more capable AI models.

Early ASLs (ASL-2 and ASL-3) were defined in significant detail, but it was more difficult to spec…
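The if-then commitment structure described above can be sketched as a simple decision rule. This is purely illustrative: the capability categories, scores, and thresholds below are hypothetical stand-ins invented for this sketch, not anything defined in the actual policy, which specifies ASLs in prose.

```python
# Illustrative sketch of the RSP's if-then commitment structure.
# All capability names and thresholds here are hypothetical, chosen only
# to show the shape of a conditional commitment: IF an evaluation crosses
# a capability threshold, THEN stricter safeguards (a higher ASL) apply.

def required_asl(capability_scores: dict) -> int:
    """Map hypothetical capability-evaluation scores to an AI Safety Level.

    Scores are assumed to be in [0, 1]. If any evaluated capability
    crosses its (hypothetical) threshold, the model requires the stricter
    safeguards of ASL-3; otherwise the ASL-2 baseline applies.
    """
    asl3_thresholds = {
        "bio_uplift": 0.5,  # e.g. assistance with dangerous biology
        "autonomy": 0.5,    # e.g. autonomous multi-step action
        "cyber": 0.5,       # e.g. offensive cyber capability
    }
    for capability, threshold in asl3_thresholds.items():
        if capability_scores.get(capability, 0.0) >= threshold:
            return 3  # stricter safeguards required before further scaling
    return 2  # baseline safeguards suffice

print(required_asl({"bio_uplift": 0.7}))  # crosses a threshold -> 3
print(required_asl({"bio_uplift": 0.1}))  # below all thresholds -> 2
```

The point of the sketch is that the commitments are conditional on evaluation results rather than on a calendar: safeguards escalate only when a capability line is crossed.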

Why This Matters

This post from Anthropic’s News team explains the thinking behind version 3.0 of the Responsible Scaling Policy: what the if-then commitment structure has delivered over two years, and why the policy is being updated. Useful reading for AI engineers and researchers interested in frontier-AI risk governance.


This post was automatically curated from Anthropic. Published on 2026-02-26T17:00:53.117Z.

Written by Cui
Hi, I am Z, the coder for cuizhanming.com!
