Claude Code sandbox bypass exposed agent data-risk gap

Joachim Høgby

May 20, 2026 · 5 min read · Source: SecurityWeek

Anthropic has patched a weakness in Claude Code that security researcher Aonan Guan says could bypass the tool’s network sandbox and send data to servers that policy was supposed to block. SecurityWeek published the story Wednesday and included Anthropic’s response.

This is not a routine developer patch note. Claude Code is used in environments where an AI coding agent can see source code, terminal context, GitHub data, tokens and local configuration. When the sandbox is treated as a security boundary, a bypass becomes a governance issue, not just a bug.

Guan describes the issue as a SOCKS5 hostname null-byte injection. The short version: a user could set a policy that allows traffic only to hosts matching a wildcard such as *.google.com. A process inside the sandbox could then request a hostname like attacker-host.com\x00.google.com. The filter matched the .google.com suffix and approved the connection as if it were going to Google. The operating system resolver stopped reading at the null byte and connected to attacker-host.com instead.

One layer interpreted the string for policy. Another layer interpreted the same bytes for the actual network call. Parser gaps like this are old security failures in new clothing. The difference here is that the failure sits inside a tool many enterprises are now treating as an operational coding and automation agent.
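The mismatch can be reproduced in a few lines. The sketch below is illustrative only and is not code from Claude Code or sandbox-runtime; the function names are hypothetical, and the strict check is one possible mitigation (reject any byte outside a conservative hostname character set), not necessarily the fix Anthropic shipped.

```python
def naive_suffix_allowed(hostname: bytes, allowed_suffix: bytes) -> bool:
    """Hypothetical policy check that only inspects the end of the raw bytes."""
    return hostname.endswith(allowed_suffix)


def c_string_view(hostname: bytes) -> bytes:
    """What a null-terminated C API effectively sees: bytes before the first NUL."""
    return hostname.split(b"\x00", 1)[0]


# Conservative set of bytes accepted in a hostname (letters, digits, '-', '.').
VALID_HOST_BYTES = set(
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
)


def strict_allowed(hostname: bytes, allowed_suffix: bytes) -> bool:
    """Reject empty names and any byte outside the hostname set before matching."""
    if not hostname or any(byte not in VALID_HOST_BYTES for byte in hostname):
        return False
    return hostname.endswith(allowed_suffix)


payload = b"attacker-host.com\x00.google.com"

policy_says_ok = naive_suffix_allowed(payload, b".google.com")  # True
resolver_target = c_string_view(payload)  # b'attacker-host.com'
strict_says_ok = strict_allowed(payload, b".google.com")  # False
```

The policy layer and the resolver layer disagree about which host the same bytes name, and validating the bytes before matching removes the disagreement.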

Guan writes that the Claude Code network sandbox became generally available on October 20, 2025, and that every release from 2.0.24 through 2.1.89 was vulnerable to at least one of two bypasses. The first, CVE-2025-66479, involved allowedDomains: [] being interpreted as “allow everything” rather than “block everything”. The second is the null-byte issue disclosed this week.
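The first bug belongs to a familiar class: an empty allowlist being read as "no policy" instead of "deny all". The sketch below shows one common way that can happen, a truthiness check that cannot tell an empty list from a missing one. This is an assumption about the general pattern, not a description of Anthropic's code; the function names are hypothetical, and treating a missing list as "no restriction configured" is also an assumption made for the example.

```python
from typing import Optional, Sequence


def is_allowed_buggy(host: str, allowed_domains: Optional[Sequence[str]]) -> bool:
    # Bug pattern: [] is falsy, so an explicit empty allowlist takes the same
    # branch as "no allowlist configured" and everything is allowed.
    if not allowed_domains:
        return True
    return host in allowed_domains


def is_allowed_fail_closed(host: str, allowed_domains: Optional[Sequence[str]]) -> bool:
    # Only a missing setting (None) means "no restriction configured".
    # An explicit empty list falls through and matches nothing: block everything.
    if allowed_domains is None:
        return True
    return host in allowed_domains


empty_list_buggy = is_allowed_buggy("attacker-host.com", [])  # True
empty_list_fixed = is_allowed_fail_closed("attacker-host.com", [])  # False
```

A deny rule like this is cheap to cover with a regression test, which is the practical meaning of "tested deny rules".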

He criticizes Anthropic for silent fixes, no clear Claude Code advisory, no meaningful changelog entry and no CVE for the new issue. SecurityWeek reports that Anthropic says its own security team had identified and fixed the issue before Guan submitted it through HackerOne. According to Anthropic, the fix was committed to sandbox-runtime on March 27 and shipped in Claude Code 2.1.88 on March 31, before the HackerOne report on April 3. Guan, by contrast, writes that the issue was fixed in Claude Code 2.1.90 on April 1.

For customers, the exact version dispute is not the main point. The main point is that a security boundary designed to control egress from an AI agent could be bypassed, and users relying on that boundary did not necessarily receive a clear operational warning.

The risk becomes sharper when combined with prompt injection. Guan points to a previous method he calls “Comment and Control”, where hidden instructions in GitHub comments, issue bodies or pull-request titles can make coding agents take actions. If the agent can also bypass network policy, it may exfiltrate environment variables, API keys, GitHub tokens, source code or infrastructure data.

That matters directly for enterprise leaders. Many companies are piloting coding agents on machines and CI/CD runners that already have access to internal repositories, cloud credentials, secrets and deployment systems. If such an agent reads an external issue, a dependency README or a pull-request comment, prompt injection can become an operational attack chain, not a lab demo.

The leadership lesson is simple: an AI agent sandbox cannot be treated as a black box. It must be governed like other privileged production tooling. That means version requirements, egress policy, secrets management, logging, SIEM integration, incident notification and clear requirements for how the vendor communicates security fixes that affect the control boundary.

CIOs and CISOs should ask four questions now. Which coding agents can access source code, terminals, cloud credentials or GitHub/GitLab tokens? Do they run with local or CI network reach into internal systems? Are egress logs detailed enough to show raw SOCKS5 or proxy traffic, not just ordinary HTTP calls? And are vendor security advisories tied into internal risk review, or are teams discovering critical agent failures through media reports and researcher blogs?
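On the logging question, it helps to see why raw proxy traffic matters. In SOCKS5 (RFC 1928), a domain-name destination is sent as a length byte followed by that many bytes, so a null byte can travel inside the name. A log that prints the name as ordinary text may hide it. The sketch below parses a CONNECT request and logs the hostname in escaped form; the function names, log format and "suspicious" rule are hypothetical and are not taken from any vendor's tooling.

```python
import struct
from typing import Tuple


def parse_socks5_connect_domain(request: bytes) -> Tuple[bytes, int]:
    """Parse VER, CMD, RSV, ATYP, then a length-prefixed domain name and port."""
    if len(request) < 5:
        raise ValueError("request too short")
    version, command, _reserved, address_type = request[0:4]
    if version != 5 or command != 1 or address_type != 3:
        raise ValueError("not a SOCKS5 CONNECT request with a domain-name address")
    name_length = request[4]
    name_end = 5 + name_length
    if len(request) != name_end + 2:
        raise ValueError("length byte does not match request size")
    host = request[5:name_end]
    (port,) = struct.unpack("!H", request[name_end:name_end + 2])
    return host, port


def egress_log_line(host: bytes, port: int) -> str:
    """Log the escaped raw bytes and flag anything outside printable ASCII."""
    suspicious = any(byte < 0x21 or byte > 0x7E for byte in host)
    return f"socks5 connect host={host!r} port={port} suspicious={suspicious}"


payload = b"attacker-host.com\x00.google.com"
request = bytes([5, 1, 0, 3, len(payload)]) + payload + struct.pack("!H", 443)
host, port = parse_socks5_connect_domain(request)
line = egress_log_line(host, port)
```

Logging the escaped bytes rather than a decoded string keeps the null byte visible to whoever reviews the egress trail or feeds it into a SIEM.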

Boards do not need a technical lecture on null bytes. They need one sentence: when AI agents get developer privileges, sandboxing, network boundaries and secrets become part of the company’s control environment. It is not enough to ask whether the model is safe. Leaders must ask what the agent can read, what it can execute, where it can connect, and how failures are disclosed.

The Anthropic case is also a vendor-risk reminder. Frontier labs are moving fast, and coding agents are moving from demos into real workflows. The security boundary is often a stack: a CLI, a local proxy, a third-party library, an operating-system sandbox and policy in configuration files. A weak interpretation in one layer can open the whole chain.

The right enterprise response is not to ban every coding agent. It is to professionalize deployment. Coding agents should run in constrained environments, with short-lived credentials, least privilege, explicit outbound policy, tested deny rules and contractual expectations that security fixes affecting control boundaries are communicated as security events.

This is the new operating reality for AI in production. The agent is not only a productivity tool. It is a process with access, network reach and permission to act. It must be governed like a process with risk.

Sources and media

Primary source: SecurityWeek, “Anthropic Silently Patches Claude Code Sandbox Bypass”, published May 20, 2026 at 9:00 AM ET. https://www.securityweek.com/anthropic-silently-patches-claude-code-sandbox-bypass/
Researcher technical write-up: Aonan Guan, “Second Time, Same Sandbox: Another Anthropic Claude Code Network Sandbox Bypass Enables Data Exfiltration”, published May 20, 2026. https://oddguan.com/blog/second-time-same-sandbox-anthropic-claude-code-network-allowlist-bypass-data-exfiltration/
Related background: SecurityWeek on “Comment and Control” prompt injection against coding agents. https://www.securityweek.com/claude-code-gemini-cli-github-copilot-agents-vulnerable-to-prompt-injection-via-comments/
