Research shows AI agents can break out of container sandboxes
A new study from the University of Oxford and the AI Security Institute reveals that AI agents can exploit common configuration vulnerabilities to escape container sandboxes.
The researchers developed SandboxEscapeBench, a benchmark that places AI models in controlled container environments and tests whether they can retrieve a protected file from the host system. The benchmark includes 18 scenarios spanning three layers: orchestration, runtime, and kernel.
The results are concerning. Frontier models successfully exploited exposed Docker sockets, writable host mounts, and privileged containers. More complex tasks and kernel-level exploits proved more challenging, but basic configuration weaknesses were consistently exploited.
This is directly relevant for anyone running AI agents in production. If you're using containers as a security layer for AI code execution, you should review your configuration immediately. Exposed Docker sockets and writable mounts are low-hanging fruit that AI agents can now demonstrably exploit.
The benchmark is available as open source through AISI's Inspect framework.
📬 Likte du denne?
AI-nyheter for ledere. Kuratert av en CIO som bygger det selv. Daglig i innboksen.