Cyber agents

Introduction

Autonomous cyber agents matter to frontier AI safety because they change the tempo of cyber conflict. Traditional cybersecurity assumes that defenders usually have time to discover a flaw, issue a patch, distribute updates, and recover from attacks before damage spreads too far. Advanced AI agents threaten that assumption. If systems can autonomously find vulnerabilities, write exploits, adapt to defences, and coordinate attacks across thousands of targets at machine speed, then the familiar “patch and respond” model may stop working fast enough.

This is one reason frontier AI discussions increasingly focus on capability thresholds rather than ordinary product safety. The concern is not merely that AI could help hackers write better phishing emails. It is that sufficiently capable agents could compress months of offensive cyber work into hours or minutes, potentially overwhelming the slower institutions that modern digital infrastructure depends on. Within the wider debate about AI abundance and humanity’s long-term future, avoiding this kind of runaway instability becomes part of preserving the conditions for scientific progress, prosperity, and civilisational resilience. [International AI Safety Report] [Frontier Model Forum]

Why speed changes cyber risk

The central issue is not just capability, but capability combined with automation and scale.

A human-led cyberattack is constrained by labour, attention, coordination costs, and time. Even sophisticated state-backed operations usually require teams of specialists working through reconnaissance, exploit development, testing, persistence, and operational security. Defensive systems evolved around this reality. Security teams patch critical systems over days or weeks because most attackers cannot instantly compromise every vulnerable machine on Earth.

Autonomous cyber agents could weaken all of those assumptions simultaneously. [AI Security Institute]

Modern frontier models are already showing rapid improvement on complex cyber tasks. The UK AI Security Institute (AISI) reported in 2026 that the length of cyber tasks frontier models can autonomously complete has been doubling every few months, with recent systems outperforming earlier trend estimates. [AI Security Institute]
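To make the doubling claim concrete, here is a minimal sketch of what steady doubling implies. The starting task length and the doubling period are illustrative assumptions, not figures published by AISI, and `projected_task_length` is a hypothetical helper name.

```python
def projected_task_length(initial_minutes: float,
                          doubling_months: float,
                          months_ahead: float) -> float:
    """Task length after `months_ahead` months, given a fixed doubling period."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return initial_minutes * 2 ** (months_ahead / doubling_months)


# Illustrative inputs only: a 60-minute task horizon that doubles every 6 months.
for months in (0, 6, 12, 24):
    minutes = projected_task_length(60, 6, months)
    print(f"after {months:>2} months: {minutes:.0f} minutes")
```

Under these assumed numbers, two years of steady doubling turns a one-hour task into a sixteen-hour one. A shorter doubling period steepens the curve considerably.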

That matters because cyber defence is heavily dependent on delay. A vulnerability often becomes catastrophic only when attackers can exploit it faster than organisations can react. The danger scenario is not “AI writes malware once”. It is:

AI systems autonomously discover or chain together vulnerabilities.
They generate working exploits at very large scale.
Attacks propagate across poorly patched systems.
Defenders cannot triage, validate, patch, and recover quickly enough.

In ordinary software security, defenders often survive because attackers face bottlenecks. Autonomous agents may remove many of them.
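The race between exploitation and patching described above can be sketched with a toy model. It assumes each system is patched after a fixed delay from disclosure and that a working exploit appears at a single point in time. The fleet data and the function name `fraction_exposed` are hypothetical.

```python
def fraction_exposed(patch_delays_hours: list[float],
                     exploit_ready_hours: float) -> float:
    """Share of systems still unpatched when a working exploit becomes available."""
    if not patch_delays_hours:
        return 0.0
    unpatched = sum(1 for delay in patch_delays_hours if delay > exploit_ready_hours)
    return unpatched / len(patch_delays_hours)


# Hypothetical fleet: hours between disclosure and patch for each system.
fleet = [12, 24, 48, 72, 168, 336, 720, 2160]

print(fraction_exposed(fleet, 24 * 30))  # exploit ready after 30 days
print(fraction_exposed(fleet, 6))        # exploit ready after 6 hours
```

With this made-up fleet, an exploit that takes a month to develop finds one system in eight still unpatched, while one that arrives within six hours finds every system exposed. The patching schedule is identical in both cases. Only the attacker's speed has changed.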

The shrinking gap between discovery and exploitation

Historically, there has often been a usable delay between a vulnerability being discovered and large-scale exploitation. That delay allows emergency patching, public advisories, temporary mitigations, and coordinated response.

AI-assisted cyber operations could compress that window dramatically.

The Frontier Model Forum notes that frontier systems may accelerate vulnerability discovery and exploit development while simultaneously lowering barriers for malicious actors. [Frontier Model Forum] Public and private evaluations increasingly focus on whether models can independently execute long attack chains rather than merely answer isolated cybersecurity questions.

AISI’s multi-step cyber range tests are especially important because they move beyond chatbot-style prompting. Researchers tested frontier models inside simulated corporate and industrial environments requiring reconnaissance, privilege escalation, lateral movement, credential handling, and chained exploitation across dozens of steps. [AI Security Institute]

The results remain limited compared with elite human operators, but the trend line worries safety researchers. In one evaluation, average completed attack steps rose sharply across successive model generations, and performance improved further when models were allowed larger inference budgets. [arXiv]
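As a rough illustration of how a metric like "average completed steps" might be tabulated, the sketch below groups run results by model and inference budget. The record layout, the model names, and every number are hypothetical and are not drawn from the cited evaluation.

```python
from collections import defaultdict
from statistics import mean


def mean_steps_by_condition(records):
    """Average steps completed for each (model, token_budget) pair.

    `records` is an iterable of (model, token_budget, steps_completed) tuples.
    """
    grouped = defaultdict(list)
    for model, token_budget, steps_completed in records:
        grouped[(model, token_budget)].append(steps_completed)
    return {condition: mean(steps) for condition, steps in grouped.items()}


# Hypothetical runs, not real evaluation data.
runs = [
    ("older_model", 1_000_000, 2),
    ("older_model", 1_000_000, 4),
    ("newer_model", 1_000_000, 9),
    ("newer_model", 1_000_000, 11),
    ("newer_model", 10_000_000, 14),
    ("newer_model", 10_000_000, 18),
]
for condition, average in mean_steps_by_condition(runs).items():
    print(condition, average)
```

Keeping the inference budget in the grouping key separates the two effects the paragraph describes: improvement across model generations and improvement from more compute per attempt.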

That creates a dangerous asymmetry. Defenders must secure enormous software ecosystems continuously. Attackers only need a few successful paths.

Machine-speed iteration

One underappreciated change is that autonomous agents can repeatedly test and refine attacks without human fatigue.

Human penetration testers may attempt a few exploit chains per day. Autonomous systems can potentially run thousands of iterations, learn from failures, modify tooling, and continue searching for alternative paths. Even if each individual attempt is imperfect, sheer speed changes the economics.
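The economics can be made concrete with a simple probability sketch. It assumes attempts are independent and share the same success probability, which real attack attempts generally will not. The per-attempt probability and attempt counts are illustrative only.

```python
def chance_of_any_success(per_attempt_probability: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    if not 0.0 <= per_attempt_probability <= 1.0:
        raise ValueError("per_attempt_probability must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - per_attempt_probability) ** attempts


# Illustrative: each attempt succeeds one time in a thousand.
for attempts in (5, 500, 5000):
    probability = chance_of_any_success(0.001, attempts)
    print(f"{attempts:>5} attempts: {probability:.3f}")
```

Under these assumptions a handful of attempts almost never succeeds, while thousands of attempts almost always produce at least one success, even though no single attempt became any better.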

Researchers and security organisations increasingly describe this as a transition from “tool assistance” to “agentic operations”. [Unit 42] The concern is not that every AI model becomes an unstoppable super-hacker overnight. It is that attackers gain scalable autonomous labour.

This matters especially for critical infrastructure and legacy systems:

Hospitals often run old software that cannot be patched quickly.
Industrial control systems may require shutdowns before updates.
Utilities and transport networks contain complicated vendor dependencies.
Small organisations lack dedicated security teams entirely.

A world where attackers gain machine-speed offensive iteration before defenders gain equally effective automated defence could produce repeated systemic crises.

What dangerous cyber capability evaluations test

Because ordinary benchmarks reveal little about real operational danger, frontier AI evaluations increasingly test autonomous behaviour under realistic constraints.

The core question is no longer simply: “Can the model explain hacking concepts?”

Instead, evaluators ask:

Can it independently plan attacks?
Can it recover from failed steps?
Can it maintain long operational sequences?
Can it adapt to changing environments?
Can it discover vulnerabilities humans did not explicitly describe?

This is why frontier safety frameworks emphasise “dangerous capability evaluations”. [GOV.UK]

Multi-step cyber ranges

AISI’s cyber evaluations use controlled attack simulations rather than trivia-style tests. Their published work describes corporate-network and industrial-control-system environments where models must chain together many different actions over extended time horizons. [AI Security Institute]

These evaluations matter because many real cyberattacks fail not on technical knowledge but on operational coordination. An autonomous agent that can:

persist through errors,
keep track of objectives,
select tools,
and dynamically alter plans

is qualitatively different from a chatbot that merely suggests commands.

Recent AISI evaluations of systems such as GPT-5.5 and Claude Mythos reported the strongest cyber performance yet observed in their testing programmes. [AI Security Institute]

Importantly, most public evaluations still show substantial limitations. Models remain unreliable, brittle, and inconsistent in many scenarios. Industrial control environments in particular remain difficult. [arXiv] But frontier safety discussions focus heavily on trajectory rather than current perfection. If capabilities continue compounding quickly, institutions may have little warning before systems cross from “useful assistant” to “serious offensive multiplier”.

Autonomous vulnerability discovery

Another major concern is AI-assisted discovery of previously unknown vulnerabilities.

Microsoft’s 2026 announcement of MDASH, an agentic vulnerability-discovery platform coordinating more than 100 specialised AI agents, illustrates the dual-use nature of the technology. The system reportedly identified numerous previously unknown Windows flaws. [TechRadar]

This demonstrates both the promise and the danger:

AI could dramatically improve defensive auditing.
The same capability could accelerate offensive discovery.

DARPA’s AI Cyber Challenge (AIxCC) was built around exactly this tension. The programme aimed to create autonomous systems capable of identifying and patching vulnerabilities at scale in real-world software. [AI Cyber Challenge]

The optimistic interpretation is powerful: AI systems may eventually secure global software infrastructure faster than humans can. The worrying interpretation is that offensive capability may diffuse more quickly than robust defensive deployment.

Cybersecurity history suggests attackers often exploit automation earlier and more aggressively than defenders adopt coordinated upgrades.

Evaluating capability before deployment

Frontier AI governance proposals increasingly argue that high-risk cyber evaluations should happen before public release, not after incidents.

The logic resembles aircraft stress testing or pharmaceutical trials. Testing comes first because, once a dangerous capability is widely deployed, rollback may become difficult and copycat proliferation may be impossible to reverse.

Several frontier frameworks therefore propose escalating safeguards once models reach concerning cyber thresholds. [Frontier Model Forum]

Potential triggers include:

autonomous exploit chaining,
high success rates on complex attack simulations,
scalable vulnerability discovery, [frontiermodelforum.org]
or demonstrated persistence in adversarial environments.
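The triggers listed above could be checked mechanically against evaluation results. The following is a minimal sketch under stated assumptions: the field names, the `CyberEvalResult` type, and the 0.5 success-rate cut-off are all hypothetical and do not come from any published framework.

```python
from dataclasses import dataclass


@dataclass
class CyberEvalResult:
    """Hypothetical summary of one model's cyber evaluation."""
    chains_exploits_autonomously: bool
    attack_simulation_success_rate: float  # 0.0 to 1.0
    discovers_vulnerabilities_at_scale: bool
    persists_in_adversarial_environments: bool


def triggered_thresholds(result: CyberEvalResult,
                         success_rate_limit: float = 0.5) -> list[str]:
    """Return the names of the triggers that this result has crossed."""
    triggers = []
    if result.chains_exploits_autonomously:
        triggers.append("autonomous exploit chaining")
    if result.attack_simulation_success_rate >= success_rate_limit:
        triggers.append("high attack-simulation success rate")
    if result.discovers_vulnerabilities_at_scale:
        triggers.append("scalable vulnerability discovery")
    if result.persists_in_adversarial_environments:
        triggers.append("persistence in adversarial environments")
    return triggers


# Hypothetical result for illustration.
example = CyberEvalResult(True, 0.7, False, False)
print(triggered_thresholds(example))
```

Returning the list of crossed triggers, rather than a single pass or fail flag, lets a framework attach different safeguards to different capabilities.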

The point is not that every capable model must be banned. Rather, developers may need stronger controls once systems begin approaching offensive capabilities that institutions cannot easily contain.

How containment and access limits could reduce harm

Cyber-risk thresholds are partly about buying time.

If autonomous offensive capability begins improving faster than global patching capacity, slowing deployment or restricting access may reduce the probability of irreversible failures while defensive systems adapt.

Limiting open access to dangerous capabilities

One proposed safeguard is tiered access control.

Under this approach, a model capable of advanced autonomous cyber operations would not be released through unrestricted public APIs or as downloadable weights. Instead:

access may require identity verification,
monitoring,
rate limits,
restricted tooling,
or tightly controlled research partnerships.
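The access conditions listed above can be sketched as a simple gate in front of a model endpoint. Everything here is an assumption made for illustration: the `AccessRequest` fields, the tool names, and the hourly limit do not describe any real provider's policy.

```python
from dataclasses import dataclass

# Hypothetical policy values, chosen only for illustration.
UNRESTRICTED_TOOLS = {"static_analysis", "patch_suggestion"}
HOURLY_REQUEST_LIMIT = 100


@dataclass
class AccessRequest:
    identity_verified: bool
    requests_last_hour: int
    requested_tools: set[str]
    is_research_partner: bool


def access_decision(request: AccessRequest) -> tuple[bool, str]:
    """Apply identity, rate-limit, and tooling checks in that order."""
    if not request.identity_verified:
        return False, "identity not verified"
    if request.requests_last_hour >= HOURLY_REQUEST_LIMIT:
        return False, "rate limit exceeded"
    restricted = request.requested_tools - UNRESTRICTED_TOOLS
    if restricted and not request.is_research_partner:
        return False, "restricted tooling: " + ", ".join(sorted(restricted))
    return True, "allowed"


# Hypothetical request from a verified user who is not a research partner.
print(access_decision(AccessRequest(True, 3, {"exploit_runner"}, False)))
```

A real deployment would also log each decision for monitoring, which this sketch leaves out.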

Anthropic’s decision not to broadly release its “Mythos” model after concerning cyber evaluations reflects this logic. Reports indicated that the company instead granted limited access to selected institutions for controlled testing. [The Guardian]

Critics argue that such restrictions concentrate power inside a few large firms and governments. Supporters counter that unrestricted proliferation of highly autonomous offensive systems could create risks comparable to releasing advanced digital weapons infrastructure to anyone with internet access.

Hardening infrastructure before capability diffusion

Another argument for thresholds is that society may need time to improve baseline cyber resilience.

The National Cyber Security Centre and related UK institutions increasingly emphasise that frontier AI changes the urgency of longstanding security weaknesses. [National Cyber Security Centre]

Many organisations still struggle with:

delayed patching,
poor network segmentation,
weak authentication,
legacy systems,
and limited monitoring.

These weaknesses are already dangerous under human-led attack conditions. Highly autonomous offensive agents could amplify them dramatically.

In this view, temporary deployment constraints are not meant to halt technological progress forever. They are meant to prevent offensive capability growth from racing too far ahead of institutional adaptation.

Defensive AI may help — but timing matters

There is also a strong optimistic case.

The same technologies enabling autonomous attack could produce:

continuous automated auditing,
machine-speed patch generation,
adaptive defence systems,
and large-scale vulnerability remediation.

DARPA’s AIxCC was motivated partly by the idea that autonomous systems could help secure critical infrastructure faster than human teams alone. [darpa.mil]

This possibility matters for the broader AI bloom thesis. A civilisation with vastly improved cyber resilience could support more ambitious scientific, medical, and economic systems safely. Advanced AI could eventually make digital infrastructure far more secure than today’s fragile patchwork environment.

But timing remains crucial.

If offensive capability scales faster than defensive deployment, societies may experience destabilising periods before defensive automation catches up. Frontier safety thresholds are partly an attempt to manage that transition rather than assuming markets or patch cycles will naturally keep pace.

Why this mechanism matters for the long-term AI future

Autonomous cyber agents are rarely discussed on the grounds that cybercrime is itself civilisation-ending. The deeper concern is that digital infrastructure underpins nearly everything else advanced societies depend on.

Finance, energy, logistics, communications, healthcare, cloud computing, industrial control systems, scientific research, and increasingly AI development itself all rely on interconnected software ecosystems.

A future in which advanced AI accelerates medicine, science, abundance, and human flourishing also becomes a future that depends even more heavily on stable digital systems. If autonomous offensive agents repeatedly overwhelm those systems, the broader optimistic vision becomes harder to sustain.

This is why frontier AI safety debates increasingly treat cyber thresholds as part of preserving the possibility of long-term human flourishing rather than as a narrow technical issue. The central question is not whether cyberattacks are new. It is whether advanced AI could push attack capability beyond the speed at which human institutions can reliably respond.

If that threshold is crossed before defensive systems mature, ordinary patching may stop being an adequate civilisational safeguard. [International AI Safety Report] [AI Security Institute]

Endnotes

Source: aisi.gov.uk
Title: Frontier AI Trends Report by The AI Security Institute (AISI)
Link: https://www.aisi.gov.uk/frontier-ai-trends-report
Source snippet
The UK AI Security Institute (AISI) has conducted evalu...

Source: aisi.gov.uk
Title: How fast is autonomous AI cyber capability advancing?
Link: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber-capability-advancing
Source snippet
The length of tasks frontier models can autonomous...

Source: aisi.gov.uk
Title: How do frontier AI agents perform in multi-step cyber-attack scenarios
Link: https://www.aisi.gov.uk/blog/how-do-frontier-ai-agents-perform-in-multi-step-cyber-attack-scenarios
Published: 16 March 2026
Source snippet
We tested seven large language models...

Source: arxiv.org
Title: Measuring AI Agents’ Progress on Multi-Step Cyber Attack Scenarios
Link: https://arxiv.org/html/2603.11214v2
Published: 13 March 2026
Source snippet
We evaluate the autonomous cyber-attack capabilities of frontie...

Source: arxiv.org
Title: Measuring AI Agents’ Progress on Multi-Step Cyber Attack Scenarios
Link: https://arxiv.org/abs/2603.11214
Published: 11 March 2026

Source: GOV.UK
Title: AI Safety Institute approach to evaluations
Link: https://www.gov.uk/government/publications/ai-safety-institute-approach-to-evaluations
Published: 9 February 2024
Source snippet
AI Safety Institute (AISI) approach to evaluations and testing of advanced AI systems to better understand what each new sy...

Source: aisi.gov.uk
Title: Our evaluation of Claude Mythos Preview's cyber capabilities
Link: https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
Source snippet
We have tracked AI cyber capabilities since...

Source: aisi.gov.uk
Title: Our evaluation of OpenAI's GPT-5.5 cyber capabilities
Link: https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities
Published: 30 April 2026
Source snippet
GPT-5.5 is one of the strongest models we have te...

Source: techradar.com
Title: Microsoft unveils MDASH, its AI agent-driven security platform
Link: https://www.techradar.com/pro/security/microsoft-unveils-mdash-its-ai-agent-driven-security-platform-and-its-already-spotted-a-host-of-new-windows-flaws
Source snippet
MDASH coordinates over 100 specialized AI agents to detect software vulnerabilities, and it has already discovered 16 previously unknown...

Source: darpa.mil
Title: AI Cyber Challenge marks pivotal inflection point for cyber...
Link: https://www.darpa.mil/news/2025/aixcc-results
Published: 8 August 2025
Source snippet
Teams' AI-driven systems find, patch real-world cyber vulnera...

Source: arxiv.org
Link: https://arxiv.org/abs/2602.07666

Source: ncsc.gov.uk
Title: Why cyber defenders need to be ready for frontier AI
Link: https://www.ncsc.gov.uk/blogs/why-cyber-defenders-need-to-be-ready-for-frontier-ai
Published: 30 March 2026
Source snippet
Recent findings from the AI Security Inst...

Source: arxiv.org
Link: https://arxiv.org/abs/2509.14589

Source: GOV.UK
Title: AI Security Institute – Frontier AI Trends report factsheet
Link: https://www.gov.uk/government/publications/ai-security-institute-frontier-ai-trends-report-factsheet/ai-security-institute-frontier-ai-trends-report-factsheet
Published: 18 December 2025
Source snippet
It brings together 2 years of government-led testing of leading AI...

Source: GOV.UK
Title: AI Safety Institute approach to evaluations
Link: https://www.gov.uk/government/publications/ai-safety-institute-approach-to-evaluations/ai-safety-institute-approach-to-evaluations
Published: 9 February 2024
Source snippet
Models that we assess are selected based on estimates of the risk of a system posses...

Source: GOV.UK
Title: Inaugural report pioneered by AI Security Institute gives clearest picture yet of capabilities of most advanced AI
Link: https://www.gov.uk/government/news/inaugural-report-pioneered-by-ai-security-institute-gives-clearest-picture-yet-of-capabilities-of-most-advanced-ai
Published: 18 December 2025
Source snippet
The analysis also identifies early signs of capabilities linked to auto...

Source: aisi.gov.uk
Title: An evaluation framework for AI misuse in fraud and cybercrime
Link: https://www.aisi.gov.uk/blog/an-evaluation-framework-for-ai-misuse-in-fraud-and-cybercrime
Published: 26 February 2026
Source snippet
We developed a scalable approach to measuring how text-based AI model...

Source: aisi.gov.uk
Link: https://www.aisi.gov.uk/
Source snippet
...nced AI and to develop and test risk mitigations.

Source: aisi.gov.uk
Link: https://www.aisi.gov.uk/category/cyber
Source snippet
GPT-5.5 is one of the strongest models we have tested on our cyber tasks and is the second model to solve...

Source: arxiv.org
Link: https://arxiv.org/html/2602.07666v2

Source: darpa.mil
Title: AIxCC: AI Cyber Challenge | Ep 89
Link: https://www.darpa.mil/news/podcast/aixcc-challenge-89
Published: 22 September 2025
Source snippet
Identifying and patching vulnerabilities at speed and scale. The AI Cyber Challenge, AIxC...

Source: darpa.mil
Title: CGC: Cyber Grand Challenge
Link: https://www.darpa.mil/research/programs/cyber-grand-challenge
Source snippet
A competition to create automatic defensive systems capable of reasoning about flaws, formulating patches and d...

Source: assets.publishing.service.gov.uk
Title: Capabilities and risks from frontier AI
Link: https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf
Source snippet
By contrast, autonomous AI agents can take long sequences of actions in pursuit of a goal, without requirin...

Source: internationalaisafetyreport.org
Title: International AI Safety Report 2026
Link: https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026
Published: 3 February 2026
Source snippet
This Report assesses what general-purpose AI systems can d...

Source: frontiermodelforum.org
Title: Managing Advanced Cyber Risks in Frontier AI Frameworks
Link: https://www.frontiermodelforum.org/technical-reports/managing-advanced-cyber-risks-in-frontier-ai-frameworks/
Published: 13 February 2026
Source snippet
Frontier AI offers significant promise for cyber...

Source: unit42.paloaltonetworks.com
Title: Frontier AI and the Future of Defense: Your Top Questions
Link: https://unit42.paloaltonetworks.com/frontier-ai-top-questions-answered/
Published: 23 April 2026
Source snippet
Unlike LLMs used for basic content generation, frontier models can autonomously identify software vulnerab...

Source: aicyberchallenge.com
Title: AI Cyber Challenge
Link: https://aicyberchallenge.com/
Source snippet
DARPA's Artificial Intelligence Cyber Challenge (AIxCC), in collaboration with ARPA-H, brings togethe...

Source: theguardian.com
Link: https://www.theguardian.com/technology/2026/may/18/anthropic-ai-claude-mythos-cyber-financial-stability-board-fsb
Source snippet
Mythos possesses advanced capabilities in identifying previously unknown flaws in IT systems, potentially exploitable by hackers. Due to...

Source: arpa-h.gov
Title: AI Cyber Challenge showcases AI's Power to secure...
Link: https://arpa-h.gov/news-and-events/arpa-h-darpa-challenge-showcases-ais-power-secure-americas-health-care
Published: 4 September 2025
Source snippet
Teams' AI-driven systems find and patch real-world cyber vulnerabiliti...
