Anthropic: AI found more than 10,000 severe vulnerabilities

Breaking

CIO CISOBoardAnthropicClaude MythosProject GlasswingCybersecurityVulnerability ManagementPatch ManagementOpen SourceSoftware Supply ChainAI GovernanceVendor RiskCritical InfrastructureEnterprise AI

Anthropic: AI found more than 10,000 severe vulnerabilities

Joachim Høgby

22. mai 202622. mai 20265 min lesingKilde: Anthropic

Del

LinkedIn X Facebook E-post WhatsApp Telegram

Anthropic says Project Glasswing has already used Claude Mythos Preview to find more than 10,000 high- or critical-severity vulnerabilities across roughly 50 partners. This is not a routine vendor note about another security tool. It is an early measurement of what happens when AI makes vulnerability discovery cheaper, faster and more scalable than the patching system around it.

The most important line in Anthropic’s update is not the 10,000 figure. It is the claim that progress in software security is now limited by how quickly findings can be verified, disclosed and patched, not by how quickly they can be discovered. For Norwegian and European organizations, that means classic patch SLAs, supplier requirements and risk registers are too slow if they still assume vulnerabilities arrive one by one.

Project Glasswing was launched to secure critical software before increasingly capable AI models can be turned against it. The partner set includes organizations that build and maintain infrastructure used across the internet and enterprise technology stacks. When those environments suddenly find hundreds of severe bugs each, the key question is no longer whether AI can find vulnerabilities. The question is who has the capacity to validate, prioritize and ship fixes before attackers run the same workflow.

What Anthropic reports

Anthropic says it and its partners have applied Claude Mythos Preview to systemically important software. After one month, most partners have each found hundreds of high- or critical-severity vulnerabilities. Collectively, the number is above 10,000. Cloudflare is highlighted with 2,000 bugs found across critical-path systems, including 400 high- or critical-severity findings, and a false positive rate the company considers better than human testers.

The open-source numbers are even more concrete. Anthropic says Mythos Preview has scanned more than 1,000 projects that underpin much of the internet and parts of Anthropic’s own infrastructure. The model estimated 23,019 vulnerabilities in total, including 6,202 rated high or critical. Of 1,752 high- or critical-rated findings reviewed by external security firms or Anthropic itself, 90.6 percent were confirmed as valid true positives. 62.4 percent were confirmed as high or critical.

That gives a rough but important signal: even if the model found no further vulnerabilities, Anthropic says it is on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code. That is in addition to the findings reported by Glasswing partners.

The example security leaders should notice is wolfSSL, an open-source cryptography library used across large numbers of products and devices. Mythos Preview found a vulnerability and also constructed an exploit that could allow an attacker to forge certificates. In practice, that could allow a fake banking or email site to look legitimate to an end user. The issue is now patched and has been assigned CVE-2026-5194.

The patch queue becomes the management problem

In many executive teams, vulnerability management is still treated as technical hygiene. This update shows why that is not enough. If AI models can increase finding rates by a factor of ten or more, the management issue becomes actual change capacity. Patch windows, regression testing, supplier notifications, maintenance agreements, incident readiness and risk ownership all have to operate at a different tempo.

Anthropic points to the established coordinated-disclosure convention of 90 days, or roughly 45 days after a patch becomes available. That balance made sense in a world where discovery and exploitation took more time. When AI reduces the cost of finding and testing flaws, those windows become more dangerous. Not because every bug automatically becomes an attack, but because the gap between discovery and remediation becomes more valuable to attackers.

This also strains the open-source ecosystem. Anthropic says some maintainers have asked it to slow down disclosures because they need more time to design patches. For organizations that use open-source components deep inside products, apps and internal systems, this is now a direct supply-chain question. It is not enough to ask whether a supplier has an SBOM. Leaders need to know who actually has the mandate and capacity to respond when AI tools find 50 relevant issues in a week.

What leaders should do now

The first move is to update patch SLAs. Actively exploited and AI-verified vulnerabilities should not be handled like ordinary monthly hygiene tasks. Critical systems need explicit thresholds for rapid testing, rollback and emergency deployment. CIO, CISO and operations need that agreement before the crisis arrives.

The second move is to tighten supplier requirements. Contracts should specify how quickly suppliers must notify customers about AI-discovered findings, how severity is documented, when temporary compensating controls must be applied, and how customers receive evidence that patches have actually been deployed. This matters most for identity, payments, customer data, healthcare, industrial operations and developer platforms.

The third move is to govern internal AI security tooling. When models are used for vulnerability discovery, they need logging, data boundaries, access controls and explicit human approval. There is a large difference between a controlled internal scan and an autonomous agent with broad repository and cloud access without an audit trail.

The fourth move is to build a triage function that can handle volume. Anthropic says generally available models can already find large numbers of software vulnerabilities, even if they do not match Mythos Preview’s most advanced offensive capability. That means more vendors, consultants and attackers will get similar workflows. The quality of triage, not the number of findings, becomes the advantage.

Mythos is not being released broadly yet

Anthropic is unusually direct about the risk. The company says models with cybersecurity skills similar to Mythos Preview will soon be developed by many AI companies, and that no company, including Anthropic, has safeguards strong enough to prevent misuse and potentially severe harm. That is why Mythos-class models have not been released to the public. At the same time, Anthropic is making tools, skills and a scanning harness available to qualifying security teams.

That is a difficult balance. Defenders need capacity before attackers get the same capacity. But the defensive capacity also creates a new operational burden: many more findings, more coordinated disclosures, more patches, more exceptions and more pressure on the same security teams.

For boards, the conclusion is simple. AI security is not only about controlling ChatGPT use and internal data. It is about the software an organization already depends on moving to a new vulnerability tempo. The risk shifts from “can we find the flaws?” to “can we absorb the findings before someone else does?”.

Sources and media

Primary source: Anthropic, “Project Glasswing: An initial update”, published May 22, 2026: https://www.anthropic.com/research/glasswing-initial-update
Source credit: Anthropic. This article relies on Anthropic’s reported figures for Project Glasswing, Claude Mythos Preview, open-source scanning, Cloudflare, Mozilla, wolfSSL and CVE-2026-5194.
Related primary sources linked by Anthropic: Cloudflare on cyber frontier models, Mozilla on Firefox findings, wolfSSL on the Mythos finding and the NVD entry for CVE-2026-5194.
Thumbnail: OpenAI Image 2 / hogby.ai.

📬 Likte du denne?

AI-nyheter for ledere. Kuratert av en CIO som bygger det selv. Daglig i innboksen.

Relaterte saker

Anthropic gjør Claude Opus 5 til ny toppmodell for agentarbeid

Breaking

AI-modellerAnthropicClaude

Anthropic gjør Claude Opus 5 til ny toppmodell for agentarbeid

Claude Opus 5 flytter Anthropic-kampen fra ren intelligens til styrbar kost, fart og sikkerhet i agentarbeid. Det er en tydelig CIO-sak, ikke bare en modellnyhet.

24. juli 20265 min lesing

Anthropic

Åpne saken

CIOCISOCTO

GitHub ruller Claude Opus 5 inn i Copilot for agentisk koding

Claude Opus 5 er tilgjengelig i GitHub Copilot for Pro+, Max, Business og Enterprise. GitHub fremhever agentiske kodeflyter, egenverifisering og strengere cyber-sperrer. For IT-ledere blir modellvalg i Copilot et spørsmål om styring, kostnad og sikkerhet – ikke bare autocomplete.

24. juli 20265 min lesing

GitHub

Åpne saken

AI-modellerGoogle AIGemini

Google gjør Gemini Flash raskere for agentarbeid

Google lanserer Gemini 3.6 Flash og 3.5 Flash-Lite med tydeligere fokus på hastighet, token-effektivitet og produksjonsklare AI-agenter.

24. juli 20264 min lesing

Google AI

Åpne saken