Biosecurity

Introduction

Biosecurity thresholds are an attempt to answer one of the hardest questions in frontier AI governance: at what point does an AI system become dangerous enough in biology that deployment should stop until stronger safeguards exist?

The concern is not that chatbots suddenly become biological weapons labs overnight. It is that increasingly capable systems may gradually lower the expertise, time, and coordination needed to cause biological harm. A model that helps a trained virologist summarise papers is one thing. A model that can guide inexperienced users through complex pathogen-related tasks, troubleshoot laboratory problems, recommend experimental pathways, or help evade safety controls is another. The closer AI systems move towards providing that kind of “dangerous biological assistance”, the stronger the argument that deployment pauses should trigger automatically.

Within the wider vision of AI-enabled abundance and scientific acceleration, this issue matters because biology is one of the areas where advanced AI could deliver extraordinary benefits: faster drug discovery, pandemic prediction, synthetic biology tools, longevity research, and accelerated medical science. But biology is also unusual because harmful knowledge can diffuse rapidly, equipment is becoming cheaper, and mistakes may be irreversible. Biosecurity thresholds are therefore an attempt to preserve the long-term upside of advanced AI without normalising systems that substantially increase catastrophic biological risk. [Frontier Model Forum: Risk Taxonomy and Thresholds for Frontier AI Frameworks] [OpenAI: Preparing for future AI capabilities in biology]

What counts as dangerous biological assistance

Most current frontier models already possess substantial biology knowledge. The real dispute is not whether models know biology, but whether they meaningfully increase a user’s practical ability to cause harm.

That distinction matters because raw information alone is often insufficient for dangerous activity. Biological research involves tacit knowledge, troubleshooting, decisions about the order of steps, equipment handling, and interpretation under uncertainty. A system becomes more concerning when it acts less like a search engine and more like a capable scientific collaborator.

Several frontier AI safety frameworks now focus on whether a model creates “uplift”: a measurable increase in a user’s ability to perform harmful biological tasks. [Frontier Model Forum: Risk Taxonomy and Thresholds for Frontier AI Frameworks] [OpenAI: Our updated Preparedness Framework]

The highest-risk forms of uplift usually include:

Helping users identify or optimise pathogens with pandemic potential.
Assisting with synthesis or modification strategies that increase transmissibility, virulence, or immune escape.
Guiding inexperienced actors through laboratory procedures that would otherwise require expert mentoring.
Troubleshooting failed experiments interactively.
Helping users acquire materials or evade biosafety oversight.
Combining biological assistance with automation, planning, coding, and literature synthesis at superhuman speed.

Importantly, many researchers do not think the danger threshold is “the AI can explain biology”. Graduate textbooks and research papers already exist online. The threshold is more likely to involve capability combinations: sustained reasoning, adaptive tutoring, strategic planning, scientific synthesis, and iterative problem-solving across long sessions. [LinkedIn] [Red Anthropic: Biorisk]

That is why frontier evaluations increasingly test whether models can help users complete realistic end-to-end workflows rather than merely answer isolated factual questions.

Why novice uplift matters more than expert assistance

One of the most important ideas in biosecurity evaluations is that helping non-experts may matter more than helping experts.

Highly trained molecular biologists already possess much of the knowledge needed for legitimate advanced research. A model that saves them time may not dramatically change overall societal risk. But if frontier systems substantially narrow the gap between experts and amateurs, the threat landscape changes.

Researchers therefore distinguish between several types of user who might receive biological assistance:

User type | Why it matters
Curious non-experts | Could AI enable dangerous experimentation that would otherwise be impossible?
Technically skilled amateurs | Could AI reduce the need for formal training or institutional support?
Existing experts | Could AI dramatically accelerate sophisticated malicious work?
Organised state or terrorist actors | Could AI lower costs, timelines, or coordination barriers?

Many proposed biosecurity thresholds focus especially on whether frontier models can substantially uplift people who currently lack advanced biological expertise. That is because catastrophic risk often depends less on whether top experts gain marginal productivity improvements and more on whether dangerous capability becomes widely accessible. [Red Anthropic: Biorisk] [Anthropic: Responsible Scaling Policy (version 2.1)]

This is also why some evaluations compare AI performance against human baselines. The question is not merely “is the model smart?” but “does the model move users across dangerous competence thresholds they otherwise would not cross?”

How evaluators test uplift for novices and experts

Testing dangerous biological assistance is far harder than benchmarking ordinary chatbot performance.

Simple multiple-choice biology exams are insufficient because real biological work depends on judgment, sequencing, persistence, and adaptation. Frontier evaluators therefore increasingly use adversarial testing methods designed to simulate realistic misuse attempts. [Frontier Model Forum] [AI Security Institute: Early lessons from evaluating frontier AI systems]

Structured red teaming

Frontier labs and external evaluators recruit biology experts to probe systems for dangerous capabilities. These “red teams” attempt to elicit harmful guidance while bypassing safeguards.

Tests may include:

Multi-step pathogen-related scenarios.
Attempts to jailbreak model restrictions.
Long conversational workflows.
Troubleshooting tasks after simulated experimental failures.
Literature synthesis under malicious framing.

Anthropic, OpenAI, and the UK AI Security Institute have all described versions of this process in public safety materials. [Anthropic: Biorisk] [OpenAI]

Capability benchmarking

Evaluators also use controlled assessments to estimate scientific competence.

The UK AI Security Institute reported that frontier models have surpassed typical biology PhD performance on some internal biology question sets. [AI Security Institute: Early lessons from evaluating frontier AI systems]

That does not mean models can autonomously engineer pandemics. But it does suggest that biological reasoning capabilities are improving rapidly enough that static assumptions may fail quickly.

End-to-end workflow testing

More advanced evaluations increasingly ask whether models can support entire research pipelines rather than isolated tasks.

For example:

Can the model propose an experimental plan?
Can it adapt when constraints change?
Can it interpret ambiguous results?
Can it recommend next steps after failure?
Can it sustain strategic coherence over long interactions?

These evaluations matter because many biological dangers emerge from cumulative assistance rather than any single instruction.

Human uplift experiments

Some of the most policy-relevant evaluations involve real users.

Researchers may compare how well participants perform biological tasks with and without AI assistance. The core metric is whether the model significantly increases dangerous capability relative to baseline human performance.

This is controversial because fully realistic experiments create obvious safety and ethical problems. As a result, many current evaluations rely on proxies, partial simulations, or restricted tasks. [arXiv: Risk thresholds for frontier AI]

That uncertainty is one reason many researchers argue for conservative thresholds rather than waiting for definitive proof of catastrophic misuse.
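
The with-and-without comparison described above can be sketched as a simple difference in success rates on a benign proxy task. This is an illustration only, not any lab's or institute's actual method: the names `ArmResult`, `uplift_estimate`, and `exceeds_threshold` are hypothetical, the counts are invented for the example, and the interval is a basic normal approximation that a small study would need to replace with an exact or bootstrap method.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmResult:
    """Outcome counts for one study arm on a benign proxy task (hypothetical)."""
    successes: int
    participants: int

    @property
    def rate(self) -> float:
        return self.successes / self.participants


def uplift_estimate(control: ArmResult, assisted: ArmResult, z: float = 1.96):
    """Return (difference in success rates, lower bound, upper bound).

    Uses a normal-approximation interval for the difference of two
    proportions; z = 1.96 corresponds to roughly 95% coverage.
    """
    diff = assisted.rate - control.rate
    se = math.sqrt(
        control.rate * (1 - control.rate) / control.participants
        + assisted.rate * (1 - assisted.rate) / assisted.participants
    )
    return diff, diff - z * se, diff + z * se


def exceeds_threshold(control: ArmResult, assisted: ArmResult, threshold: float) -> bool:
    """Conservative reading: flag when the interval's upper bound reaches the
    threshold, rather than waiting for the point estimate to do so."""
    _, _, upper = uplift_estimate(control, assisted)
    return upper >= threshold


# Invented counts for illustration; not data from any real study.
control = ArmResult(successes=6, participants=40)
assisted = ArmResult(successes=14, participants=40)
diff, lower, upper = uplift_estimate(control, assisted)
print(f"uplift={diff:.2f}, interval=({lower:.2f}, {upper:.2f})")
```

Checking the upper bound rather than the point estimate is one way to encode the conservative stance: a noisy study with a wide interval counts as a reason for caution, not as evidence of safety.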

The hardest problem: defining the actual pause line

The central governance challenge is deciding what level of biological capability should trigger deployment restrictions or pauses.

In principle, the idea sounds straightforward: if a model becomes too dangerous, stop deployment until safeguards improve. In practice, the line is difficult to define because capability growth is gradual and biological risk is probabilistic.

Current frameworks therefore tend to use threshold categories rather than binary “safe/unsafe” labels.

OpenAI’s Preparedness Framework distinguishes between “High” and “Critical” capability thresholds. High thresholds involve substantial amplification of existing severe risks, while Critical thresholds involve qualitatively new pathways to catastrophic harm. [OpenAI: Frontier risk and preparedness] [OpenAI: Preparedness Framework v2]

Anthropic’s Responsible Scaling Policy similarly links escalating capability levels to stronger security and operational requirements. [Anthropic: Progress from our Frontier Red Team]

Across frameworks, several candidate triggers recur.

A model substantially uplifts non-experts

One proposed pause condition is when ordinary technically literate users can achieve capabilities previously restricted to highly trained specialists.

This threshold matters because it implies dangerous expertise is becoming democratised faster than oversight systems can adapt.

Safeguards become unreliable under adversarial pressure

A second trigger is when dangerous assistance can be extracted consistently despite protections.

The UK AI Safety Institute (since renamed the AI Security Institute) and other evaluators have repeatedly shown that safeguards can often be bypassed through jailbreaking or prompt manipulation. [The Guardian: AI safeguards can easily be broken, UK Safety Institute finds]

If a model’s underlying capabilities become highly dangerous while protections remain fragile, deployment risk rises sharply.

Evaluators cannot confidently bound risk

Another proposed threshold is epistemic rather than capability-based: deployment should pause when evaluators no longer understand the model’s dangerous behaviours well enough to justify release.

This is especially relevant for systems that display emergent reasoning abilities, strategic planning, or autonomous scientific workflows.

Dangerous capability outpaces defensive infrastructure

Some researchers argue thresholds should depend not only on offensive capability but also on whether society has adequate defensive systems.

For example:

Can misuse be detected quickly?
Are DNA synthesis screening systems robust?
Are laboratory reporting systems effective?
Can intelligence agencies track misuse pathways?
Are international response mechanisms credible?

Under this view, deployment may be unacceptable even if a model is not fully catastrophic on its own because broader institutions are unprepared.
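
The four candidate triggers above can be read as a pre-committed decision rule: limits are fixed in advance, and any single breach is enough to pause. The sketch below shows that structure and nothing more. Every field name, score, and numeric limit is a hypothetical placeholder, not a value taken from any published framework.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationSummary:
    """Hypothetical summary of one evaluation round (all fields are assumptions)."""
    novice_uplift_upper_bound: float  # upper bound on measured novice uplift, 0..1
    safeguard_bypass_rate: float      # share of red-team attempts that got through
    evaluators_can_bound_risk: bool   # evaluators' own judgment of their understanding
    defensive_readiness: float        # 0..1 score for detection and response capacity


@dataclass(frozen=True)
class TriggerPolicy:
    """Limits written down before evaluation; placeholders, not recommendations."""
    max_novice_uplift: float = 0.10
    max_bypass_rate: float = 0.05
    min_defensive_readiness: float = 0.50


def pause_reasons(summary: EvaluationSummary, policy: TriggerPolicy) -> list[str]:
    """Return every trigger that fired; an empty list means no pause is required."""
    reasons = []
    if summary.novice_uplift_upper_bound > policy.max_novice_uplift:
        reasons.append("substantial non-expert uplift")
    if summary.safeguard_bypass_rate > policy.max_bypass_rate:
        reasons.append("safeguards unreliable under adversarial pressure")
    if not summary.evaluators_can_bound_risk:
        reasons.append("evaluators cannot confidently bound risk")
    if summary.defensive_readiness < policy.min_defensive_readiness:
        reasons.append("capability outpaces defensive infrastructure")
    return reasons


policy = TriggerPolicy()
round_a = EvaluationSummary(0.04, 0.02, True, 0.70)
round_b = EvaluationSummary(0.04, 0.20, False, 0.70)
print(pause_reasons(round_a, policy))
print(pause_reasons(round_b, policy))
```

Returning the list of reasons rather than a bare yes or no keeps the decision auditable, and freezing the policy object mirrors the idea that limits should not be adjusted after results are known.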

Why biology may justify earlier pauses than other domains

Many frontier AI debates compare biological risk with cyber risk. Both involve information-based harm amplified by automation. But biology has several characteristics that may justify stricter thresholds.

Biological harm can scale silently

Cyberattacks are often detectable quickly. Biological threats may spread invisibly for weeks before recognition.

Pandemics also create nonlinear effects: health system overload, political instability, economic disruption, and global coordination failures.

Defensive cycles are slower

Software vulnerabilities can sometimes be patched rapidly. Biological countermeasures usually require slower processes: surveillance, testing, vaccine development, manufacturing, and distribution.

AI may accelerate defensive biology too, but there is no guarantee defence and offence scale equally.

Tacit knowledge barriers may erode suddenly

Historically, advanced biological work depended heavily on expert mentorship and institutional environments.

A highly capable AI tutor could weaken those barriers by providing personalised guidance continuously and cheaply. That possibility is central to many current biosecurity concerns. [Red Anthropic: Biorisk] [OpenAI: Preparing for future AI capabilities in biology]

Open diffusion creates persistent exposure

Once highly capable models or weights spread widely, reversing access becomes difficult.

That makes biosecurity thresholds partly about timing. A temporary pause before broad release may be one of the few opportunities to strengthen safeguards before diffusion becomes irreversible.

The case against aggressive deployment pauses

Not everyone agrees that biosecurity thresholds should halt frontier deployment.

Critics raise several important objections.

Current evidence remains limited

Many dangerous capability claims remain speculative. Public evidence that existing models dramatically increase real-world biological threat capability is still limited.

Some evaluations show impressive biology reasoning without demonstrating reliable real-world operational competence. [arXiv: Risk thresholds for frontier AI]

Critics argue that premature deployment freezes could overreact to hypothetical scenarios.

Defensive benefits may outweigh offensive risk

Advanced AI could also strengthen biosecurity:

Faster vaccine design.
Better outbreak modelling.
Accelerated diagnostics.
Automated pathogen surveillance.
Improved protein modelling.
Faster medical research.

Under this argument, slowing beneficial AI systems could itself increase long-run risk by delaying scientific progress.

This tension is especially important within the broader AI bloom perspective. The same systems that might increase misuse risk could also radically improve medicine, longevity, and pandemic resilience.

Thresholds may be too subjective

Capability evaluations remain noisy and incomplete. Different evaluators may reach different conclusions about the same model.

Some researchers therefore worry that thresholds could become politicised or inconsistently applied.

Competitive pressure may undermine pauses

Even companies that publicly support thresholds face incentives to continue scaling if rivals do not stop.

Anthropic’s 2026 revisions to its Responsible Scaling Policy drew attention partly because critics argued the company weakened earlier commitments to pause development under certain conditions. [PC Gamer]

This highlights a broader governance problem: unilateral restraint may be unstable without wider international coordination.

What a credible biosecurity threshold regime would require

A meaningful threshold system cannot rely on vague promises alone. For pauses to work credibly, several conditions likely need to exist simultaneously.

Independent evaluations

Labs should not be the sole judges of their own systems.

External evaluators, including institutions like the UK AI Security Institute, are increasingly important because they provide adversarial testing outside commercial incentives. [AI Security Institute: Early lessons from evaluating frontier AI systems]

Pre-committed trigger conditions

Thresholds become weaker if companies can reinterpret them after models become commercially valuable.

Clear advance commitments reduce the temptation to redefine danger reactively.

Strong model security

Some frameworks increasingly focus not only on deployment but also on protecting model weights from theft or exfiltration. [OpenAI]OpenAIupdating our preparedness frameworkOur updated Preparedness Framework15 Apr 2025 — We've streamlined levels to two clear thresholds that map to specific operational commitm…

This matters because dangerous capabilities may spread through leaks even without public release.

International coordination

Biological risk is global. A threshold regime that applies only to one company or country may fail if competitors continue deploying equally capable systems elsewhere.

Continuous rather than one-off testing

Capability growth is fast and uneven. Systems may gain dangerous abilities unexpectedly through scaling, tool integration, or fine-tuning.

That means biosecurity evaluations cannot be treated as static certification exercises.

Why this debate matters for humanity’s long-term future

The argument over biosecurity thresholds is ultimately about whether civilisation can navigate a transition to far more powerful AI systems without losing control of catastrophic risks along the way.

The optimistic vision of AI bloom depends on advanced AI becoming a force for flourishing: accelerating medicine, reducing scarcity, expanding scientific understanding, and potentially helping humanity achieve forms of prosperity and resilience previously impossible.

But those gains depend on avoiding irreversible failures during the transition period.

Biosecurity thresholds are one attempt to operationalise that caution. They ask whether there are some capability levels where the downside risk becomes too large for ordinary deployment logic to remain acceptable. The core idea is not anti-progress. It is that preserving the possibility of a flourishing long-term future may require slowing or pausing specific forms of deployment until humanity’s defensive capacity catches up with its offensive technological power.

Endnotes

Source: cdn.openai.com
Title: preparedness framework v2
Link: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf
Source snippet
Framework15 Apr 2025 — Critical capability thresholds mean capabilities that present a meaningful risk of a qualitatively new threat vect...
Source: arxiv.org
Title: arXiv Risk thresholds for frontier AI
Link: https://arxiv.org/abs/2406.14713
Source: linkedin.com
Link: https://www.linkedin.com/posts/sophie-kathryn-williams_last-week-anthropics-responsible-scaling-activity-7435383884797874177-YhCE
Source snippet
LinkedInAnthropic Updates Responsible Scaling Policy with...Last week, Anthropic's Responsible Scaling Policy (RSP) underwent its most s...
Source: red.anthropic.com
Link: https://red.anthropic.com/2025/biorisk/
Source snippet
Red AnthropicBiorisk \ red.anthropic.com5 Sept 2025 — In this post, we want to expand on our perspective on AI and biological risk (biori...
Source: arxiv.org
Title: arXiv Evaluating Frontier Models for Dangerous Capabilities
Link: https://arxiv.org/abs/2403.13793
Source: anthropic.com
Title: strategic warning for ai risk progress and insights from our frontier red team
Link: https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team
Source snippet
AnthropicProgress from our Frontier Red Team19 Mar 2025 — In this post, we are sharing what we have learned about the trajectory of poten...
Source: OpenAI
Title: preparing for future ai capabilities in biology
Link: https://openai.com/index/preparing-for-future-ai-capabilities-in-biology/
Source snippet
18 Jun 2025 — We complement these measures with always-on Detection & Response, dedicated Threat Intelligence, and an Insider-Risk progra...
Source: OpenAI
Title: updating our preparedness framework
Link: https://openai.com/index/updating-our-preparedness-framework/
Source snippet
Our updated Preparedness Framework15 Apr 2025 — We've streamlined levels to two clear thresholds that map to specific operational commitm...
Source: anthropic.com
Title: responsible scaling policy v3
Link: https://www.anthropic.com/news/responsible-scaling-policy-v3
Source snippet
AnthropicResponsible Scaling Policy Version 3.0Feb 24, 2026 — The RSP is our attempt to solve the problem of how to address AI risks that...
Source: www-cdn.anthropic.com
Link: https://www-cdn.anthropic.com/17310f6d70ae5627f55313ed067afc1a762a4068.pdf
Source snippet
Anthropic's Responsible Scaling Policy (version 2.1)Mar 31, 2025 — AI Safety Level Standards (ASL Standards) are a set of technical and o...
Source: time.com
Title: uk ai safety institute
Link: https://time.com/7204670/uk-ai-safety-institute/
Source snippet
This led to the establishment of the UK's AI Safety Institute (AISI) in November 2023, with a mandate to evaluate the risks of new AI mod...

Published: November 2023
Source: OpenAI
Link: https://openai.com/
Source snippet
OpenAI | Research & Deployment: We believe our research will eventually lead to artificial general intelligence, a system that can solve...
Source: OpenAI
Title: frontier risk and preparedness
Link: https://openai.com/index/frontier-risk-and-preparedness/
Source snippet
Frontier risk and preparedness, 26 Oct 2023: We are developing our approach to catastrophic risk preparedness, including building a Pre...
Source: anthropic.com
Title: announcing our updated responsible scaling policy
Link: https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy
Source snippet
15 Oct 2024 — This update introduces a more flexible and nuanced approach to assessing and managing AI risks while maintaining our commit...
Source: arxiv.org
Link: https://arxiv.org/html/2511.05526v1
Source snippet
Emergency Response Measures for Catastrophic AI Risk28 Oct 2025 — Similarly, OpenAI's Preparedness Framework sets thresholds for biologic...
Source: arxiv.org
Link: https://arxiv.org/abs/2509.24394
Source snippet
[2509.24394] The 2025 OpenAI Preparedness Framework...by S Coggins · 2025 · Cited by 2 —...
Source: arxiv.org
Link: https://arxiv.org/pdf/2509.24394
Source snippet
For prioritised risks, the Preparedness Framework encourages OpenAI's CEO to deploy.Read m...
Source: linkedin.com
Link: https://www.linkedin.com/pulse/openais-preparedness-framework-red-marble-ai-vfvtc
Source snippet
OpenAI's preparedness frameworkBut its focus is on catastrophic risk, defined as any risk which could result in hundreds of billions of d...
Source: linkedin.com
Link: https://www.linkedin.com/pulse/medium-risk-ai-facilitating-biological-threats-gianluca-mondillo-md-5gxdf
Source snippet
Medium Risk of AI in Facilitating Biological ThreatsBest Practices for AI Threat Modeling. 10 Posts. 1,419 · How to Respond When AI Model...
Source: linkedin.com
Link: https://www.linkedin.com/posts/tesssbuckley_today-uks-ai-security-institute-of-department-activity-7407352566029828097-ZTJf
Source snippet
UK AI Security Institute Publishes Frontier AI Trends ReportAs the first public analysis of trends by AISI it draws on two years' worth o...
Source: linkedin.com
Link: https://www.linkedin.com/pulse/openais-preparedness-framework-scaling-ai-responsibly-cyril-bhr4e
Source snippet
OPENAI'S PREPAREDNESS FRAMEWORK: SCALING...Race to the Bottom on Safety: The document highlights the concept of “marginal risk”—the dang...
Source: governance.ai
Link: https://www.governance.ai/analysis/anthropics-rsp-v3-0-how-it-works-whats-changed-and-some-reflections
Source snippet
Anthropic's RSP v3.0: How it Works, What's Changed, and...Mar 17, 2026 — Anthropic's Responsible Scaling Policy (RSP) – its framework fo...
Source: youtube.com
Title: Yoshua Bengio Warns AI Biosecurity Risks Surpass Safety Thresholds
Link: https://www.youtube.com/watch?v=sfK1zuWBuyo
Source snippet
Anthropic's Plan to Stop AI Bioweapons & Autonomous Misuse...
Source: youtube.com
Title: Anthropic’s Plan to Stop AI Bioweapons & Autonomous Misuse
Link: https://www.youtube.com/watch?v=n5h1GNvzqIg
Source snippet
AI can bypass biosecurity safeguards to recreate deadly toxins, researchers say...
Source: frontiermodelforum.org
Title: risk taxonomy and thresholds
Link: https://www.frontiermodelforum.org/technical-reports/risk-taxonomy-and-thresholds/
Source snippet
Frontier Model ForumRisk Taxonomy and Thresholds for Frontier AI FrameworksJun 18, 2025 — Frontier AI frameworks outline methodologies fo...
Source: frontiermodelforum.org
Title: frontier capability assessments
Link: https://www.frontiermodelforum.org/technical-reports/frontier-capability-assessments/
Source snippet
Frontier Model ForumFrontier Capability AssessmentsApr 22, 2025 — Frontier Capability Assessments are procedures conducted on frontier mo...
Source: aisi.gov.uk
Title: early lessons from evaluating frontier ai systems
Link: https://www.aisi.gov.uk/blog/early-lessons-from-evaluating-frontier-ai-systems
Source snippet
AI Security InstituteEarly lessons from evaluating frontier AI systems | AISI Work24 Oct 2024 — We look into the evolving role of third-p...
Source: aisi.gov.uk
Link: https://www.aisi.gov.uk/blog/5-key-findings-from-our-first-frontier-ai-trends-report
Source snippet
AI Security Institute5 key findings from our first Frontier AI Trends Report18 Dec 2025 — In 2024, we first tested a model to surpass bio...
Source: aisi.gov.uk
Link: https://www.aisi.gov.uk/frontier-ai-trends-report
Source snippet
AI Security InstituteFrontier AI Trends Report by The AI Security Institute (AISI)The UK AI Security Institute (AISI) has conducted evalu...
Source: theguardian.com
Title: The Guardian AI safeguards can easily be broken, UK Safety Institute finds
Link: https://www.theguardian.com/technology/2024/feb/09/ai-safeguards-can-easily-be-broken-uk-safety-institute-finds
Source snippet
The institute's research revealed that AI safeguards could be easily bypassed using basic prompts or more sophisticated jailbreaking tech...
Source: pcgamer.com
Link: https://www.pcgamer.com/software/ai/anthropic-ditches-its-defining-safety-promise-to-pause-dangerous-ai-development-because-its-basically-pointless-when-everybody-else-is-blazing-ahead/
Source snippet
Previously, under its Responsible Scaling Policy (RSP), Anthropic pledged to halt AI development should new systems reach dangerous capab...
Source: metr.org
Link: https://metr.org/fsp
Source snippet
Frontier AI Safety PoliciesPublished company policies; Anthropic logo. Responsible Scaling Policy, v3.0. February 24, 2026; OpenAI logo...

Published: February 24, 2026
Source: GOV.UK
Title: ai security institute frontier ai trends report factsheet
Link: https://www.gov.uk/government/publications/ai-security-institute-frontier-ai-trends-report-factsheet
Source snippet
Security Institute – Frontier AI Trends report factsheet18 Dec 2025 — The UK AI Security Institute (AISI) has conducted evaluations of...
Source: GOV.UK
Title: ai security institute frontier ai trends report factsheet
Link: https://www.gov.uk/government/publications/ai-security-institute-frontier-ai-trends-report-factsheet/ai-security-institute-frontier-ai-trends-report-factsheet
Source snippet
It brings...Read more...
Source: frontiermodelforum.org
Title: issue brief components of frontier ai safety frameworks
Link: https://www.frontiermodelforum.org/updates/issue-brief-components-of-frontier-ai-safety-frameworks/
Source snippet
Issue Brief: Components of Frontier AI Safety Frameworks8 Nov 2024 — Frontier AI safety frameworks are designed to enable developers to t...
Source: forum.effectivealtruism.org
Title: openai preparedness framework
Link: https://forum.effectivealtruism.org/posts/p6Wccw2Gg3ESLMvRr/openai-preparedness-framework
Source snippet
effectivealtruism.orgOpenAI: Preparedness framework18 Dec 2023 — The framework is explicitly about catastrophic risk, and indeed it's cle...
Source: digital.nemko.com
Title: anthropic ai safety strategy what enterprises must know
Link: https://digital.nemko.com/news/anthropic-ai-safety-strategy-what-enterprises-must-know
Source snippet
details Responsible Scaling Policy for frontier AIAug 25, 2025 — Anthropic's Responsible Scaling Policy is designed to keep risk “below a...
Source: thezvi.wordpress.com
Title: openai preparedness framework 2 0
Link: https://thezvi.wordpress.com/2025/05/02/openai-preparedness-framework-2-0/
Source snippet
The Cybersecurity thresholds are reworded but essentially unchanged...Read more...
Source: aisecurityandsafety.org
Title: anthropic rsp vs openai preparedness framework
Link: https://aisecurityandsafety.org/en/compare/anthropic-rsp-vs-openai-preparedness-framework/
Source snippet
Anthropic Responsible Scaling Policy vs OpenAI...13 Apr 2026 — Anthropic Responsible Scaling Policy emphasizes requirements such as "Ass...
Source: mlq.ai
Link: https://mlq.ai/news/anthropic-releases-revised-responsible-scaling-policy-30-with-adjusted-safety-commitments/
Source snippet
Anthropic Releases Revised Responsible Scaling Policy...Mar 2, 2026 — Anthropic updated its Responsible Scaling Policy (RSP) to Version...
Source: youtube.com
Link: https://www.youtube.com/watch?v=GVE2zPtHZvY
Source snippet
OpenAI's New Safety Preparedness FrameworkHow OpenAI is going to handle the safety challenges of frontier models. The AI Breakdown helps...
Source: ainews.com
Link: https://www.ainews.com/p/anthropic-revises-ai-safety-policy-with-risk-reports-external-review-and-new-transparency-rules
Source snippet
ring, external review, and greater transparency as AI capabilities advance.Read more...
Source: thezvi.substack.com
Link: https://thezvi.substack.com/p/on-openais-preparedness-framework
Source snippet
OpenAI's Preparedness Framework - by Zvi MowshowitzIt describes OpenAI's processes to track, evaluate, forecast, and protect against cata...

Additional References

Source: researchgate.net
Link: https://www.researchgate.net/publication/395968831_The_2025_OpenAI_Preparedness_Framework_does_not_guarantee_any_AI_risk_mitigation_practices_a_proof-of-concept_for_affordance_analyses_of_AI_safety_policies
Source snippet
The 2025 OpenAI Preparedness Framework does not...20 Sept 2025 — These statements purport to establish risk thresholds and safety proced...
Source: techuk.org
Link: https://www.techuk.org/resource/how-the-ai-safety-institute-is-approaching-evaluations.html
Source snippet
How the AI Safety Institute is approaching evaluationsModels selected for evaluation will be based on the estimated risk of a system's ha...
Source: atlas.latticeflow.ai
Link: https://atlas.latticeflow.ai/framework/openai-preparedness-framework-v2/
Source snippet
/ Preparedness Framework v2Framework for evaluating and managing catastrophic risks from frontier models. Defines High/Critical capabilit...
Source: inspect.aisi.org.uk
Link: https://inspect.aisi.org.uk/
Source snippet
AIWelcome. Inspect is a framework for frontier AI evaluations developed by the UK AI Security Institute and Meridian Labs. Inspect can be...
Source: assets.publishing.service.gov.uk
Link: https://assets.publishing.service.gov.uk/media/653aabbd80884d000df71bdc/emerging-processes-frontier-ai-safety.pdf
Source snippet
Processes for Frontier AI SafetyAssessments like model evaluations and red teaming could help to understand the risks frontier AI systems...
Source: ratings.safer-ai.org
Link: https://ratings.safer-ai.org/company/openai/
Source snippet
– Risk Management RatingsHigh capability thresholds mean capabilities that significantly increase existing risk vectors for severe harm...
Source: longtermresilience.org
Link: https://www.longtermresilience.org/why-we-recommend-risk-assessments-over-evaluations-for-ai-enabled-biological-tools-bts/
Source snippet
Why we recommend risk assessments over evaluations...27 Mar 2024 — Despite the central role of UK Government-led model evaluations for f...
Source: transformernews.ai
Title: aisi ai security institute frontier ai trends report biorisk self replication
Link: https://www.transformernews.ai/p/aisi-ai-security-institute-frontier-ai-trends-report-biorisk-self-replication
Source snippet
AI is making dangerous lab work accessible to novices...Dec 18, 2025 — AI models are rapidly improving at potentially dangerous biologic...
Source: techuk.org
Title: uk ai security institute releases inaugural frontier ai trends report
Link: https://www.techuk.org/resource/uk-ai-security-institute-releases-inaugural-frontier-ai-trends-report.html
Source snippet
UK AI Security Institute releases inaugural Frontier AI...18 Dec 2025 — The report is based on a series of wide-ranging evaluations of o...
Source: longtermresilience.org
Title: how the uk government can govern the risk of loss of control
Link: https://www.longtermresilience.org/reports/how-the-uk-government-can-govern-the-risk-of-loss-of-control/
Source snippet
3 Feb 2026 — The AI Security Institute is working on new benchmarks for assessing these risks, undertaking world-leading research on sche...
