Detecting AI-Automated Container Escapes with Runtime Monitoring

Detecting AI-Automated Container Escapes with Runtime Monitoring

The Problem

Traditional container intrusion detection is calibrated for human attackers: a human penetration tester exploring a container takes minutes to hours, manually executing commands, pausing to read output, and iterating on findings. Existing Falco rules and anomaly detection thresholds assume this pacing — alert on nsenter being called at all, flag a single write to /proc/sysrq-trigger, treat any execution of capsh as suspicious. LLM-automated exploitation is categorically different, and those rules miss the most dangerous attack pattern emerging from the current generation of frontier models.

The UK AI Security Institute’s SandboxEscapeBench measured this directly: Claude Opus 4.5 and GPT-5 escape containers in approximately 50% of attempts. The escape attempts do not look like a human pentester probing the environment — they look like a burst of systematic operations executed in a tight loop with no pauses between steps. An LLM agent executes 50–100 capability enumeration operations per second, attempts 10 or more exploit chains in the time a human attacker would be reading a man page, and produces no “thinking pauses” in the syscall trace. The AI does not need to understand what it is reading before it acts on it.

Three specific LLM attack signatures need dedicated detection rules that existing tooling does not cover:

  1. Mass /proc enumeration within a short time window. An LLM agent systematically reads every file under /proc/self/ and /proc/[pid]/ to enumerate capabilities, environment variables, open file descriptors, namespace memberships, and mount points. A human attacker reads a handful of targeted files. An AI agent reads all of them, often in alphabetical order, within seconds.

  2. Rapid sequential execution of known escape-prerequisite binaries. Tools like nsenter, unshare, capsh, findmnt, and lsns each appear in specific escape chains. A human attacker tries one or two. An AI agent runs through all of them sequentially — trying each known technique from a training-data-derived list of container escape methods — within a 30-second window.

  3. Write attempts to kernel interfaces from non-privileged processes. Once the AI has enumerated capabilities, it attempts writes to /proc/sysrq-trigger, /sys/kernel/security/, or /proc/sys/kernel/ regardless of whether it has the privileges to succeed. The attempt itself, even when it fails, is a reliable signal that an automated escape chain is executing.

The key insight is that frequency and sequencing are the detectable signatures of AI-driven attacks, not the individual operations. Any single operation in the list above could be legitimate. The pattern of all of them executed within seconds is not.

Threat Model

Three attacker profiles drive the detection requirements for AI-automated container escapes:

LLM agent with initial code execution. The attacker has gained code execution inside a container — through a vulnerable application, a compromised image, or a supply chain compromise — and has deployed or invoked an LLM agent to automate the escape. The agent operates in a tight loop: enumerate capabilities, select an escape technique from its training data, attempt it, observe the result, select the next technique. The loop runs without human supervision. The entire escape attempt, from initial execution to successful host access, can complete in under 60 seconds if the container is misconfigured.

AI-assisted human attacker. The human provides initial access and high-level direction; an LLM tool generates and executes the specific escape commands. This pattern is faster than fully manual exploitation but slower than the fully autonomous agent. The same frequency-based detection rules apply but with slightly more relaxed timing — commands arrive in bursts rather than at machine speed throughout.

Automated offensive tooling with embedded LLM capability. An emerging class of offensive tools — scanners and exploitation frameworks — now incorporates LLM-based reasoning for container escape. These tools scan a container, invoke an LLM to generate environment-specific escape commands, and execute them without human involvement. Detection is identical to the autonomous agent case.

Distinguishing from benign processes requires care. Legitimate monitoring agents (node exporters, Prometheus exporters, cAdvisor) read /proc at high frequency as part of their normal function. Distinguishing them from attack patterns requires process identity filtering: the mass-enumeration signature is a container workload process — not a known monitoring binary — reading across multiple /proc/[pid]/ directories. Similarly, nsenter appears in legitimate container debugging workflows, but a single invocation by an operator is not the same signal as ten invocations in thirty seconds from the same container.

Hardening Configuration

Falco Rule: Mass /proc Enumeration

The AI capability enumeration signature is a process inside a container reading more than a threshold number of files from /proc/self/ or /proc/[pid]/ within a short time window. Falco’s count condition with a sliding window detects this pattern.

- list: proc_enum_ignored_images
  items:
    - prometheus
    - cadvisor
    - node-exporter
    - datadog-agent

- macro: is_monitoring_container
  condition: container.image.repository in (proc_enum_ignored_images)

- rule: AI Mass Proc Enumeration
  desc: >
    A container process read more than 50 files from /proc within 5 seconds.
    This matches the AI agent capability enumeration pattern from SandboxEscapeBench.
    Human attackers read a handful of targeted files; AI agents read exhaustively.
  condition: >
    open_read
    and container
    and (fd.name startswith /proc/self/
         or fd.name glob "/proc/[0-9]*/status"
         or fd.name glob "/proc/[0-9]*/environ"
         or fd.name glob "/proc/[0-9]*/ns/*"
         or fd.name glob "/proc/[0-9]*/fdinfo/*"
         or fd.name glob "/proc/[0-9]*/maps")
    and not is_monitoring_container
    and not (k8s.ns.name in (kube-system, monitoring, observability))
  output: >
    AI-pattern proc enumeration detected
    (file=%fd.name command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name
     pod=%k8s.pod.name count=%evt.count)
  priority: HIGH
  tags: [container-escape, ai-attack, proc-enumeration]
  threshold:
    field: fd.name
    count: 50
    period: 5

The threshold block requires Falco 0.38+ with the sliding-window count feature. For earlier Falco versions, deploy a supplementary Prometheus rule that counts falco_events_total filtered by the rule name and fires when the rate exceeds the threshold.

Falco Rule: Rapid Escape-Tool Execution

The AI exploit-chaining signature is execution of known escape-relevant binaries in rapid succession from the same container. One call to nsenter is a debugging operation. Three calls in thirty seconds, combined with calls to capsh and lsns in the same window, is an automated escape chain.

- list: escape_relevant_binaries
  items:
    - nsenter
    - unshare
    - capsh
    - findmnt
    - lsns
    - newuidmap
    - newgidmap
    - pivot_root

- macro: exec_escape_binary
  condition: >
    spawned_process
    and container
    and proc.name in (escape_relevant_binaries)
    and not (k8s.ns.name in (kube-system, monitoring))

- rule: AI Escape Tool Burst
  desc: >
    A container executed escape-relevant binaries more than 3 times in 30 seconds.
    A single execution may be legitimate debugging. A burst within 30 seconds
    matches the AI systematic exploit-chaining pattern.
  condition: >
    exec_escape_binary
  output: >
    Escape binary burst detected
    (binary=%proc.name command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name
     pod=%k8s.pod.name parent=%proc.pname)
  priority: CRITICAL
  tags: [container-escape, ai-attack, escape-chain]
  threshold:
    field: proc.name
    count: 3
    period: 30
    scope: container.id

The scope: container.id parameter constrains the count to a single container instance rather than all containers in a namespace, preventing a large deployment from diluting the per-container signal.

eBPF Detection: Kernel Interface Write Attempts

Tetragon provides kernel-level detection with enforcement capability. This TracingPolicy detects write attempts to kernel interfaces that indicate an escape attempt is in progress, regardless of whether the attempt succeeds. The AI agent will attempt these writes even when it lacks the capability to complete them — the attempt itself is the signal.

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: ai-escape-kernel-writes
  namespace: kube-system
spec:
  kprobes:
    - call: "security_file_open"
      syscall: false
      args:
        - index: 0
          type: "file"
        - index: 1
          type: "int"
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values:
                - "/proc/sysrq-trigger"
                - "/sys/kernel/security/"
                - "/proc/sys/kernel/"
                - "/sys/fs/cgroup/release_agent"
                - "/sys/fs/cgroup/notify_on_release"
          matchArgs:
            - index: 1
              operator: "Mask"
              values:
                - "2"
          matchNamespaces:
            - namespace: Mnt
              operator: NotIn
              values:
                - "host_ns"
          matchActions:
            - action: Post
              rateLimit: "1/minute"
              rateLimitScope: "container"

    - call: "security_file_open"
      syscall: false
      args:
        - index: 0
          type: "file"
        - index: 1
          type: "int"
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values:
                - "/proc/sysrq-trigger"
                - "/sys/kernel/security/"
          matchArgs:
            - index: 1
              operator: "Mask"
              values:
                - "2"
          matchNamespaces:
            - namespace: Mnt
              operator: NotIn
              values:
                - "host_ns"
          matchActions:
            - action: Sigkill

The policy first logs the attempt (Post action) for all kernel interface writes from container namespaces. For writes specifically to /proc/sysrq-trigger or /sys/kernel/security/ — the highest-confidence escape signals — it also kills the process immediately. Tetragon requires kernel 5.8+ and adds approximately 2% CPU overhead on high-traffic nodes.

Falco Rule: User Namespace Creation Burst

Rootless container tools create user namespaces as part of normal operation. An AI agent attempting user namespace-based privilege escalation creates them in rapid succession, testing whether CLONE_NEWUSER + CLONE_NEWNS combinations yield elevated capabilities. One creation is normal; two within 10 seconds from the same non-system container is an automated escape attempt.

- macro: is_rootless_tool
  condition: >
    proc.name in (buildah, podman, skopeo, rootlesskit)
    or proc.pname in (buildah, podman, skopeo, rootlesskit)

- rule: AI User Namespace Creation Burst
  desc: >
    A container called unshare(CLONE_NEWUSER) more than twice in 10 seconds.
    Rootless tools create one user namespace; AI escape chains systematically
    test namespace combinations for privilege escalation.
  condition: >
    spawned_process
    and container
    and proc.name = unshare
    and (proc.args contains "--user" or proc.args contains "-U"
         or proc.args contains "CLONE_NEWUSER")
    and not is_rootless_tool
    and not (k8s.ns.name in (kube-system))
  output: >
    User namespace creation burst
    (command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name
     pod=%k8s.pod.name user=%user.name)
  priority: CRITICAL
  tags: [container-escape, ai-attack, user-namespace]
  threshold:
    field: proc.name
    count: 2
    period: 10
    scope: container.id

Correlate and Alert with Low Latency

For AI-automated attacks, detection latency matters more than for human attacks. A human attacker who triggers an alert and is not immediately stopped has hours or days before completing an escape. An AI agent that triggers an alert and is not immediately stopped can complete the escape within the remaining seconds of the same 60-second window. Alert routing and automated response must be configured for sub-minute end-to-end latency.

falcosidekick:
  config:
    webhook:
      address: "https://response-automation.internal/api/falco-events"
      minimumpriority: "high"
      checkcert: true
      mutualtls: true

    slack:
      webhookurl: "${SLACK_SECURITY_WEBHOOK}"
      minimumpriority: "critical"
      messageformat: >
        *AI escape pattern detected* | pod: `{{ .output_fields.k8s.pod.name }}`
        ns: `{{ .output_fields.k8s.ns.name }}`
        rule: `{{ .rule }}`
        time: `{{ .time }}`

    pagerduty:
      routingkey: "${PAGERDUTY_SECURITY_KEY}"
      minimumpriority: "critical"
actions:
  ai_escape_detection:
    rules:
      - "AI Mass Proc Enumeration"
      - "AI Escape Tool Burst"
      - "AI User Namespace Creation Burst"
    steps:
      - type: kubectl
        command: >
          annotate pod {{ .pod }} -n {{ .namespace }}
          security.io/quarantine-reason="ai-escape-detection"
          security.io/quarantine-time="{{ .time }}"
      - type: kubectl
        command: >
          label pod {{ .pod }} -n {{ .namespace }}
          security.io/network-quarantine=true
      - type: kubectl
        command: "delete pod {{ .pod }} -n {{ .namespace }} --grace-period=0"
  ai_kernel_write:
    rules:
      - "AI Escape Kernel Interface Write"
    steps:
      - type: kubectl
        command: "cordon {{ .hostname }}"
      - type: kubectl
        command: "delete pod {{ .pod }} -n {{ .namespace }} --grace-period=0"
      - type: alert
        severity: critical
        channel: "#security-incidents"
        message: >
          Node {{ .hostname }} cordoned: kernel interface write from container
          {{ .pod }} in {{ .namespace }}. Manual investigation required.

The automated response ladder follows escalating confidence: high-confidence signals (kernel interface writes) trigger immediate node cordon and pod deletion. Lower-confidence signals (proc enumeration, escape binary bursts) trigger pod deletion and network quarantine but leave the node available for scheduling while the team investigates.

Expected Behaviour After Hardening

After deploying the mass-enumeration rule, an LLM agent that reads 100 files from /proc/self/ in 3 seconds triggers the alert before the enumeration is complete. The 50-file-in-5-seconds threshold fires at the halfway point of a typical AI enumeration pass. Falcosidekick delivers the webhook within 2 seconds of the threshold crossing; the automated response handler deletes the pod within 5 seconds of the webhook delivery. Total elapsed time from the start of enumeration to pod deletion: under 10 seconds.

After deploying the escape-tool frequency rule, a rapid sequence of nsenter, capsh, and lsns calls — the pattern observed in SandboxEscapeBench AI escape chains — triggers the CRITICAL alert on the third binary execution. For a typical AI escape chain that runs 8–12 distinct escape tools, the alert fires before the agent has tried a third of its technique list.

After deploying the Tetragon kernel write policy, a write attempt to /proc/sysrq-trigger from a container namespace kills the offending process at the kernel level before the write completes. The Tetragon event is exported to centralized logging within 1 second; the Sigkill action fires in microseconds.

Trade-offs and Operational Considerations

Frequency-based rules introduce false positive risk for any legitimate workload that operates at high frequency within a container. Node exporters, eBPF-based monitoring agents, and container introspection tools all read /proc at rates that could trigger the mass-enumeration rule. The exclusion lists (proc_enum_ignored_images, is_monitoring_container) must be populated with the specific images in your environment after measuring baseline rates. Do not set thresholds without first running the rules in dry-run mode — Falco’s --dry-run flag or a shadow rule that outputs to a separate log — and measuring the false-positive rate against production traffic for at least 72 hours.

Automated kill and isolation responses require careful scoping. A false positive that deletes a production pod during a high-traffic period causes a real outage. The recommended deployment sequence is: alerting only for the first two weeks, automated network quarantine (label-based, via a NetworkPolicy that a controller enforces) in week three, and automated pod deletion in week four if the false-positive rate in the alerting phase is below one per week. For services where a false positive is catastrophic — payment processing, primary databases — do not enable automated deletion; route to human review instead.

Tetragon’s Sigkill enforcement requires kernel 5.8 or later. Verify node kernel versions before deploying enforcement policies:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\n"}{end}'

Nodes running kernels below 5.8 will reject the TracingPolicy. Deploy monitoring-only Falco rules as a fallback for those nodes until they can be upgraded.

Failure Modes

Falco rules deployed but the Falco DaemonSet not running on all nodes is the most common coverage gap. A DaemonSet that does not tolerate all node taints leaves some nodes unmonitored. An AI agent that can move its workload to an unmonitored node — by manipulating pod scheduling or exploiting a scheduler vulnerability — escapes detection entirely. Verify coverage by checking that the Falco DaemonSet pod count equals the total node count:

kubectl get daemonset falco -n kube-system -o jsonpath='{.status.desiredNumberScheduled}{"\t"}{.status.numberReady}{"\n"}'

Any gap between desired and ready counts requires immediate investigation of node taints and DaemonSet tolerations.

Alerts routed to a Slack channel that is not monitored outside business hours eliminate the detection value for off-hours attacks. AI agents do not respect business hours. A PagerDuty routing key with a 24/7 on-call rotation is required for CRITICAL alerts. Slack is appropriate only for HIGH alerts that can wait for the next business day.

Frequency thresholds set too high to avoid false positives create detection gaps that the AI attack completes within. If the mass-enumeration threshold is set to 200 reads in 5 seconds — to avoid false positives from a specific monitoring tool — an AI agent that completes its enumeration in 150 reads before switching to active exploitation passes through the window undetected. Tune thresholds by measuring the maximum legitimate read rate in your environment, then setting the threshold at 1.5x that rate rather than 4x or 10x. If the gap between the legitimate baseline and the AI attack pattern is too small to allow a useful threshold, that monitoring tool needs to be excluded by process identity rather than by raising the threshold.