ContainerSSH Network Isolation: Per-Session NetworkPolicy and Egress Control


Problem

ContainerSSH replaces traditional SSH bastions with a Kubernetes-native model: each authenticated session spawns a dedicated Pod, giving operators per-session container isolation, audit logging, and the ability to inject custom environments without maintaining long-lived shell servers. The security gains on the host layer are real — there is no persistent SSH daemon accumulating sessions, no shared user home directories, and no risk of one session’s process interfering with another at the OS level.

The network layer tells a different story. Unless you explicitly configure NetworkPolicy, every ContainerSSH session Pod inherits the default Kubernetes networking model: flat, open, and fully routable. A Pod in the ssh-sessions namespace can initiate TCP connections to any other Pod in any namespace, reach any ClusterIP service, query the Kubernetes API server at its well-known address, and send arbitrary DNS queries to kube-dns. This is not a ContainerSSH-specific flaw — it is the intentional default of the Kubernetes network model, which expects operators to layer access control on top.

The consequence for ContainerSSH is that the isolation story you intend to sell — “each user gets a scoped session container” — does not extend to the network plane without deliberate configuration. A user who has been granted SSH access gets a session Pod from which they can, without any additional exploitation, directly connect to your internal PostgreSQL instance, query other users’ session Pods, enumerate internal services via DNS, and in many cluster configurations reach the Kubernetes API server using the Pod’s mounted service account token. A compromised user credential is a network compromise.

This article addresses that gap end-to-end: default-deny egress for the ssh-sessions namespace, per-user label injection via ContainerSSH’s config webhook, targeted egress allowances using standard NetworkPolicy, Cilium CiliumNetworkPolicy for L7 path-level control, DNS policy enforcement, and Hubble-based observability to detect policy violations in real time.

Target systems: Kubernetes 1.27+ with ContainerSSH 0.5+, Cilium 1.15+ as the CNI (standard NetworkPolicy examples also provided for non-Cilium clusters).

Threat Model

1. Compromised session reaching the database tier directly. A ContainerSSH session Pod in ssh-sessions has no network boundary between it and a PostgreSQL instance in the db namespace. An attacker who controls the session — via credential theft, supply chain compromise of the session container image, or a vulnerability in tooling the user runs inside the session — can connect to postgres.db.svc.cluster.local:5432 and attempt authentication. If the database accepts connections from cluster-internal IPs without mutual TLS or application-level credential enforcement, this is a single TCP connection away from data exfiltration.

2. Inter-session lateral movement. Multiple users run concurrent ContainerSSH sessions. Each session Pod is reachable from every other session Pod because all are in the same namespace with no intra-namespace isolation. A user can port-scan the ssh-sessions Pod CIDR, identify other active sessions, and attempt to exploit tooling running inside them. This is especially relevant when sessions run as non-minimal images that include web servers, Jupyter kernels, or debug sidecar processes that bind to ports.

3. Kubernetes API server access from session Pod. Every Pod in a Kubernetes cluster can reach the API server, typically at https://10.96.0.1:443 or via the kubernetes.default.svc.cluster.local FQDN. Kubernetes automatically mounts a service account token at /var/run/secrets/kubernetes.io/serviceaccount/token unless automountServiceAccountToken: false is set. An attacker inside a session Pod can use this token to enumerate cluster resources, escalate privileges if the service account has overly permissive RBAC, or extract secrets from other namespaces. The path from SSH session to cluster owner is direct.

4. DNS exfiltration. Even when all other egress is blocked, DNS queries to kube-dns usually remain open: sessions need name resolution to reach their allowed services, so operators permit UDP/53 and TCP/53 without restricting which names can be queried. DNS exfiltration tools encode data in query subdomains, sending sequences of queries like chunk1.secret.attacker.com and chunk2.secret.attacker.com, and receive responses from an attacker-controlled authoritative DNS server. Because kube-dns forwards queries for external FQDNs to upstream resolvers, a session Pod with only DNS egress allowed can exfiltrate data at a rate limited only by the DNS query rate.

Configuration / Implementation

Step 1: Default-deny NetworkPolicy for the ssh-sessions namespace

The foundation is a default-deny policy that blocks all ingress and egress for every Pod in the ssh-sessions namespace. Subsequent policies grant only the specific traffic paths required.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: ssh-sessions
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

podSelector: {} with no matchLabels selects every Pod in the namespace. With both Ingress and Egress listed under policyTypes and no corresponding rules, all traffic is denied. Kubernetes NetworkPolicy is additive — a Pod’s effective policy is the union of all policies that select it — so this default-deny is the baseline from which targeted allowances are granted.

Apply this before ContainerSSH is configured to create session Pods in this namespace. If you apply it after, there will be a window where sessions run without isolation.
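
The additive behaviour described above can be modelled in a few lines of Go. This is a simplified sketch, not the Kubernetes implementation: the Policy and Rule types are hypothetical, selectors are reduced to a single label, and every policy is assumed to list Egress under policyTypes.

```go
package main

import "fmt"

// Rule is a simplified egress allowance: destination namespace, app label, port.
type Rule struct {
	Namespace string
	App       string
	Port      int
}

// Policy is a simplified NetworkPolicy. An empty SelectorKey models
// podSelector: {} and selects every Pod in the namespace.
type Policy struct {
	Name        string
	SelectorKey string
	SelectorVal string
	Egress      []Rule // allowances only; there is no way to express a deny
}

func (p Policy) selects(podLabels map[string]string) bool {
	if p.SelectorKey == "" {
		return true
	}
	return podLabels[p.SelectorKey] == p.SelectorVal
}

// egressAllowed returns true if no policy selects the Pod (it is not
// isolated), or if any selecting policy contains a matching allowance.
func egressAllowed(policies []Policy, podLabels map[string]string, dst Rule) bool {
	isolated := false
	for _, p := range policies {
		if !p.selects(podLabels) {
			continue
		}
		isolated = true
		for _, r := range p.Egress {
			if r == dst {
				return true
			}
		}
	}
	return !isolated
}

func main() {
	postgres := Rule{Namespace: "db", App: "postgres", Port: 5432}
	policies := []Policy{
		{Name: "default-deny-all"}, // selects everything, allows nothing
		{Name: "allow-egress-db-team-postgres", SelectorKey: "ssh-target",
			SelectorVal: "db-team", Egress: []Rule{postgres}},
	}
	fmt.Println(egressAllowed(policies, map[string]string{"ssh-target": "db-team"}, postgres))
	fmt.Println(egressAllowed(policies, map[string]string{"ssh-target": "readonly"}, postgres))
	fmt.Println(egressAllowed(nil, map[string]string{"ssh-target": "readonly"}, postgres))
}
```

The last call shows why ordering matters: with no policy selecting the Pod, nothing is isolated and the connection is allowed.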

Step 2: Injecting per-user labels via the ContainerSSH config webhook

ContainerSSH’s config webhook fires for every incoming connection and allows your backend to return a customised Pod spec for the session. This is the hook point for injecting per-user NetworkPolicy labels based on group membership.

The webhook receives a JSON payload including the authenticated username. Your webhook implementation looks up the user’s group membership (LDAP, OIDC claims, or a local mapping) and returns a Pod spec with labels that encode the user’s access scope:

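// Abbreviated Go webhook handler; full implementation varies by auth backend.
// The containerssh.* type names are abbreviated: check them against the
// ContainerSSH library version you build against.
func handleConfig(w http.ResponseWriter, r *http.Request) {
    var req containerssh.ConfigRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid request body", http.StatusBadRequest)
        return
    }

    username := req.Username
    groups, err := lookupGroups(username) // your auth backend
    if err != nil {
        // Fail closed: no session is better than an unlabelled one
        http.Error(w, "group lookup failed", http.StatusInternalServerError)
        return
    }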

    labels := map[string]string{
        "app":          "ssh-session",
        "session-user": sanitizeLabel(username),
    }

    // Map group membership to an egress scope. A Pod carries a single
    // ssh-target label, so for a user in several groups the last match wins.
    for _, g := range groups {
        switch g {
        case "db-team":
            labels["ssh-target"] = "db-team"
        case "api-team":
            labels["ssh-target"] = "api-team"
        case "readonly":
            labels["ssh-target"] = "readonly"
        }
    }

    resp := containerssh.ConfigResponse{
        Config: containerssh.AppConfig{
            Backend: "kubernetes",
            Kubernetes: containerssh.KubernetesConfig{
                Pod: containerssh.PodConfig{
                    Metadata: metav1.ObjectMeta{
                        Labels: labels,
                    },
                    Spec: corev1.PodSpec{
                        AutomountServiceAccountToken: boolPtr(false),
                        DNSPolicy:                    corev1.DNSNone,
                        DNSConfig: &corev1.PodDNSConfig{
                            Nameservers: []string{"10.100.0.10"},
                            Searches:    []string{},
                            Options: []corev1.PodDNSConfigOption{
                                {Name: "ndots", Value: strPtr("1")},
                            },
                        },
                    },
                },
            },
        },
    }
    json.NewEncoder(w).Encode(resp)
}

Key points in this webhook response:

  • AutomountServiceAccountToken: false prevents the default token mount, so the session Pod holds no API server credential
  • DNSPolicy: None with an explicit Nameservers list routes DNS through a controlled resolver (covered in Step 7)
  • Labels like ssh-target: db-team are what the per-scope NetworkPolicy resources will match on
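
The handler above calls sanitizeLabel without defining it. A minimal sketch follows, assuming the goal is a valid Kubernetes label value (at most 63 characters, alphanumeric at both ends, with only "-", "_" and "." in between). The scopeFor helper and its priority order are hypothetical; they show one way to make scope selection deterministic for users in several groups.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeLabel maps an arbitrary username to a valid Kubernetes label value.
// Disallowed characters become "-"; the result is capped at 63 characters and
// stripped of leading/trailing non-alphanumerics. It may return "".
func sanitizeLabel(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9',
			r == '-', r == '_', r == '.':
			b.WriteRune(r)
		default:
			b.WriteByte('-')
		}
	}
	out := b.String() // ASCII only at this point, so byte slicing is safe
	if len(out) > 63 {
		out = out[:63]
	}
	return strings.Trim(out, "-_.")
}

// scopePriority is an assumed ordering; which scope should win for a user in
// several groups is a policy decision, not something the source prescribes.
var scopePriority = []string{"db-team", "api-team", "readonly"}

// scopeFor returns the first scope in scopePriority that the user belongs to.
func scopeFor(groups []string) (string, bool) {
	member := make(map[string]bool, len(groups))
	for _, g := range groups {
		member[g] = true
	}
	for _, s := range scopePriority {
		if member[s] {
			return s, true
		}
	}
	return "", false
}

func main() {
	fmt.Println(sanitizeLabel("alice@example.com"))
	scope, ok := scopeFor([]string{"readonly", "db-team"})
	fmt.Println(scope, ok)
}
```

Sanitising is lossy: alice@example.com and alice-example.com map to the same label value, so the label is suitable for selection and correlation but not as a unique identity.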

Step 3: Per-scope egress allowances using standard NetworkPolicy

With labels injected by the webhook, NetworkPolicy resources can grant targeted egress to specific services. Each policy selects session Pods by label and allows egress only to the named destination.

Egress to PostgreSQL for db-team sessions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-db-team-postgres
  namespace: ssh-sessions
spec:
  podSelector:
    matchLabels:
      ssh-target: db-team
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: db
          podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432

This grants egress on TCP/5432 to Pods with app: postgres in the db namespace — but only for session Pods labelled ssh-target: db-team. Sessions without that label remain under the default-deny policy.
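
Every scope repeats this shape with a different label, namespace, app and port, so the policy objects can be generated from a table rather than written by hand. The following is a sketch using Go's text/template; the Scope type and renderPolicies function are hypothetical names, and the output should be reviewed before it is applied.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// Scope describes one access scope and the single destination it may reach.
type Scope struct {
	Name      string // value of the ssh-target label
	Namespace string // destination namespace
	App       string // destination app label
	Port      int    // destination TCP port
}

// PolicyName follows the naming used in this article.
func (s Scope) PolicyName() string {
	return "allow-egress-" + s.Name + "-" + s.App
}

const policyTmpl = `---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: {{.PolicyName}}
  namespace: ssh-sessions
spec:
  podSelector:
    matchLabels:
      ssh-target: {{.Name}}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: {{.Namespace}}
          podSelector:
            matchLabels:
              app: {{.App}}
      ports:
        - protocol: TCP
          port: {{.Port}}
`

// renderPolicies returns one YAML document per scope.
func renderPolicies(scopes []Scope) (string, error) {
	t, err := template.New("policy").Parse(policyTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	for _, s := range scopes {
		if err := t.Execute(&b, s); err != nil {
			return "", err
		}
	}
	return b.String(), nil
}

func main() {
	out, err := renderPolicies([]Scope{
		{Name: "db-team", Namespace: "db", App: "postgres", Port: 5432},
		{Name: "api-team", Namespace: "platform", App: "internal-api", Port: 8080},
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```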

Egress to an internal API for api-team sessions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-api-team-internal-api
  namespace: ssh-sessions
spec:
  podSelector:
    matchLabels:
      ssh-target: api-team
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: platform
          podSelector:
            matchLabels:
              app: internal-api
      ports:
        - protocol: TCP
          port: 8080

Step 4: Blocking inter-session traffic

The default-deny policy from Step 1 already blocks inter-session traffic in both directions. A connection from one session Pod to another is dropped at the source by the egress default-deny, and would also be dropped at the destination by the ingress default-deny. None of the allowances in Step 3 select destinations in ssh-sessions, so no additional standard NetworkPolicy is required.

Standard NetworkPolicy cannot express an explicit deny. Every rule under egress is an allowance, and a Pod’s effective policy is the union of all allowances that select it. A policy whose namespaceSelector uses operator: NotIn to exclude ssh-sessions would therefore not block anything: it would allow egress to every other namespace and defeat the default-deny. For an explicit defence-in-depth backstop on Cilium clusters, use a CiliumNetworkPolicy deny rule, which takes precedence over allow rules:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: block-inter-session-egress
  namespace: ssh-sessions
spec:
  endpointSelector:
    matchLabels:
      app: ssh-session
  egressDeny:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: ssh-sessions

Because deny rules override allows, sessions cannot reach other session Pods even if a future policy change inadvertently opens egress broadly. On non-Cilium clusters, rely on the default-deny and review every new policy for selectors that could match Pods in ssh-sessions.

Step 5: Blocking egress to the Kubernetes API server

The API server is exposed inside the cluster as the kubernetes Service in the default namespace. With the default-deny in place and no allowance that selects it, session Pods already cannot reach it. Standard NetworkPolicy cannot add an explicit block on top: an ipBlock of 0.0.0.0/0 with the API server address under except is an allowance for every other IP address, not a deny rule, and must not be applied to session Pods. On Cilium clusters, add a deny rule against the kube-apiserver entity as a backstop:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: block-egress-kube-apiserver
  namespace: ssh-sessions
spec:
  endpointSelector:
    matchLabels:
      app: ssh-session
  egressDeny:
    - toEntities:
        - kube-apiserver

The kube-apiserver entity identifies the API server itself, so there is no ClusterIP to hard-code. To find the ClusterIP for testing, run kubectl get svc kubernetes -n default -o jsonpath='{.spec.clusterIP}' (10.96.0.1 on many clusters). Note: this deny only matters if another policy opens broad egress. The safest model is to never open ipBlock egress and to use only podSelector/namespaceSelector rules. The explicit deny here acts as a backstop.

Step 6: Cilium CiliumNetworkPolicy for L7 control

Standard NetworkPolicy operates at L3/L4 — IP addresses and ports. Cilium’s CiliumNetworkPolicy extends this to HTTP methods, paths, and DNS FQDNs. For ContainerSSH sessions that need to reach an internal API, L7 policy lets you enforce not just “can reach the API on port 8080” but “can only call GET /status and POST /jobs.”

L7 HTTP egress policy for api-team sessions:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-api-team-l7
  namespace: ssh-sessions
spec:
  endpointSelector:
    matchLabels:
      ssh-target: api-team
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:app: internal-api
            k8s:io.kubernetes.pod.namespace: platform
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: /status
              - method: GET
                path: /jobs(/[a-zA-Z0-9_-]+)?
              - method: POST
                path: /jobs

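This allows GET /status, GET /jobs, GET /jobs/<id>, and POST /jobs. Any other method or path, including DELETE /jobs/123, GET /admin, or PUT /jobs/123, is denied at the Envoy proxy layer with an HTTP 403, and the backend never sees the denied request. Because allow rules are additive, an L4-only allowance for the same destination and port (such as allow-egress-api-team-internal-api from Step 3) permits every request regardless of these L7 rules. On Cilium clusters, apply this L7 policy instead of the Step 3 policy for api-team, not alongside it.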
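
The rule set can be exercised offline before it is deployed. The sketch below approximates the matching in Go, assuming the path expression must match the whole request path and treating the method as an exact string; Cilium evaluates these rules in Envoy, so this is a test aid and not a copy of its implementation. The httpRule type and allowed function are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// httpRule pairs an HTTP method with a compiled, fully anchored path regex.
type httpRule struct {
	method string
	path   *regexp.Regexp
}

// mustRule anchors the expression so that it must match the entire path.
func mustRule(method, path string) httpRule {
	return httpRule{method: method, path: regexp.MustCompile(`^(?:` + path + `)$`)}
}

// apiTeamRules mirrors the http rules in allow-egress-api-team-l7.
var apiTeamRules = []httpRule{
	mustRule("GET", `/status`),
	mustRule("GET", `/jobs(/[a-zA-Z0-9_-]+)?`),
	mustRule("POST", `/jobs`),
}

// allowed reports whether any rule matches the request.
func allowed(method, path string) bool {
	for _, r := range apiTeamRules {
		if r.method == method && r.path.MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	requests := [][2]string{
		{"GET", "/status"},
		{"GET", "/jobs/build-42"},
		{"POST", "/jobs"},
		{"DELETE", "/jobs/123"},
		{"GET", "/admin"},
	}
	for _, rq := range requests {
		fmt.Printf("%-6s %-16s allowed=%v\n", rq[0], rq[1], allowed(rq[0], rq[1]))
	}
}
```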

DNS egress policy restricting allowed FQDNs:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-dns-controlled
  namespace: ssh-sessions
spec:
  endpointSelector:
    matchLabels:
      app: ssh-session
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s:k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
            - port: "53"
              protocol: TCP
          rules:
            dns:
              - matchName: "postgres.db.svc.cluster.local"
              - matchName: "internal-api.platform.svc.cluster.local"
              - matchPattern: "*.db.svc.cluster.local"

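Only the explicitly listed FQDNs and patterns resolve successfully. Queries for any other domain, including attacker-controlled exfiltration domains, are rejected by Cilium’s DNS proxy (with a REFUSED response by default) before they reach the resolver. matchPattern supports * as a glob for subdomain matching. Do not use matchPattern: "*" as an allowance, because that is equivalent to no DNS restriction. This policy targets kube-dns; if session Pods use the dedicated resolver from Step 7 instead, point toEndpoints at that resolver’s Pods.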
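
To check which names a rule set would admit, the matching can be approximated in Go. This sketch assumes that * in a matchPattern matches zero or more DNS-valid characters but does not cross a dot, and that a bare "*" matches every name; confirm both against the Cilium version in use. The dnsAllowed and patternToRegexp functions are hypothetical helpers, not part of Cilium.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// patternToRegexp converts a glob-style pattern to an anchored regex.
// Assumption: "*" expands to DNS-valid characters excluding ".".
func patternToRegexp(pattern string) *regexp.Regexp {
	if pattern == "*" {
		return regexp.MustCompile(`^.*$`) // bare "*" admits every name
	}
	parts := strings.Split(strings.ToLower(pattern), "*")
	for i, p := range parts {
		parts[i] = regexp.QuoteMeta(p)
	}
	return regexp.MustCompile(`^` + strings.Join(parts, `[-a-zA-Z0-9_]*`) + `$`)
}

// dnsAllowed reports whether query matches any exact name or pattern.
func dnsAllowed(names, patterns []string, query string) bool {
	q := strings.ToLower(strings.TrimSuffix(query, "."))
	for _, n := range names {
		if strings.ToLower(n) == q {
			return true
		}
	}
	for _, p := range patterns {
		if patternToRegexp(p).MatchString(q) {
			return true
		}
	}
	return false
}

func main() {
	names := []string{
		"postgres.db.svc.cluster.local",
		"internal-api.platform.svc.cluster.local",
	}
	patterns := []string{"*.db.svc.cluster.local"}
	for _, q := range []string{
		"postgres.db.svc.cluster.local",
		"replica.db.svc.cluster.local",
		"chunk1.secret.attacker.com",
	} {
		fmt.Printf("%-40s allowed=%v\n", q, dnsAllowed(names, patterns, q))
	}
}
```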

Step 7: Custom DNS resolver for session Pods

Setting dnsPolicy: None in the Pod spec (injected by the webhook in Step 2) routes all DNS queries through a controlled resolver rather than the default kube-dns. The nameserver address returned by the webhook (10.100.0.10 in Step 2) must be the ClusterIP of this resolver’s Service. Deploy a dedicated DNS resolver for session Pods that enforces additional policy:

apiVersion: v1
kind: ConfigMap
metadata:
  name: session-coredns-config
  namespace: dns-control
data:
  Corefile: |
    . {
        errors
        health
        ready
        # Allow only internal cluster domains
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        # Block all external resolution: there is no forward rule.
        # Queries outside the kubernetes zones get an error (SERVFAIL), not an answer
        log
        cache 30
    }

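Deploy this as a CoreDNS instance in a dedicated dns-control namespace and restrict its accessibility to ssh-sessions Pods only. Session Pods also need an explicit egress allowance to this resolver on port 53, because the default-deny from Step 1 blocks it otherwise. The absence of a forward directive means queries for external domains are never sent upstream; with no plugin left to answer them, CoreDNS returns an error response (SERVFAIL rather than NXDOMAIN). This closes the DNS exfiltration path via external FQDN resolution.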

Step 8: Hubble observability for session traffic

Hubble is Cilium’s network observability layer. When deployed, it captures flow records for every connection attempt — including policy-denied drops. For ContainerSSH, this provides real-time visibility into what sessions are attempting to reach.

Enable Hubble in your Cilium Helm values:

hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns:query;ignoreAAAA
      - drop
      - tcp
      - flow
      - icmp
      - http

Query live session traffic with Hubble CLI filtered to the ssh-sessions namespace:

hubble observe \
  --namespace ssh-sessions \
  --type drop \
  --follow

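For automated alerting, use Hubble’s Prometheus metrics. The hubble_drop_total counter increments whenever a packet is dropped and carries a reason label. The rules below also filter on namespace, direction, and source labels; these are only present if the drop metric is configured with the matching context options, so adjust the label names to what your Hubble metrics configuration actually exports: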

# Prometheus alerting rule — alert on unexpected drops from session Pods
groups:
  - name: containerssh-network
    rules:
      - alert: SessionPodUnexpectedEgressDrop
        expr: |
          rate(hubble_drop_total{
            namespace="ssh-sessions",
            direction="EGRESS",
            reason!="POLICY_DENIED"
          }[5m]) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Session Pod egress dropped for unexpected reason"
          description: "Non-policy drop in ssh-sessions namespace: {{ $labels.reason }}"

      - alert: SessionPodHighDropRate
        expr: |
          rate(hubble_drop_total{
            namespace="ssh-sessions",
            direction="EGRESS",
            reason="POLICY_DENIED"
          }[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Session Pod making many blocked egress attempts"
          description: "Session Pod {{ $labels.source }} dropped > 10/s — possible scanning"

A high rate of POLICY_DENIED drops from a single session Pod is a strong indicator of active scanning or exfiltration attempts. Correlate the source Pod name with the ContainerSSH audit log to identify the authenticated user.
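
For ad-hoc triage, the same per-Pod count can be taken from the Hubble CLI output. The sketch below reads JSON lines from standard input and counts dropped flows per source Pod. The field names (flow.verdict, flow.source.namespace, flow.source.pod_name) are assumptions about the JSON emitted by hubble observe -o json and should be checked against your Hubble version; sessionDrop is a hypothetical helper.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// flowLine holds only the fields this sketch needs (assumed field names).
type flowLine struct {
	Flow struct {
		Verdict string `json:"verdict"`
		Source  struct {
			Namespace string `json:"namespace"`
			PodName   string `json:"pod_name"`
		} `json:"source"`
	} `json:"flow"`
}

// sessionDrop returns the source Pod name if the line is a dropped flow
// originating in the given namespace.
func sessionDrop(line []byte, namespace string) (string, bool) {
	var fl flowLine
	if err := json.Unmarshal(line, &fl); err != nil {
		return "", false // skip anything that is not a JSON flow record
	}
	if fl.Flow.Verdict != "DROPPED" || fl.Flow.Source.Namespace != namespace {
		return "", false
	}
	return fl.Flow.Source.PodName, true
}

func main() {
	counts := make(map[string]int)
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // flow records can be long
	for sc.Scan() {
		if pod, ok := sessionDrop(sc.Bytes(), "ssh-sessions"); ok {
			counts[pod]++
		}
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	for pod, n := range counts {
		fmt.Printf("%s\t%d\n", pod, n)
	}
}
```

Feed it a bounded capture rather than a followed stream, for example by piping the output of hubble observe --namespace ssh-sessions --type drop -o json into the program, since it prints totals only when its input ends.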

Expected Behaviour

The table below maps session Pod actions to the policies described above and the expected Hubble observation:

| Session Pod Action | Applicable Policy | Outcome | Hubble Observation |
| --- | --- | --- | --- |
| Connect to postgres.db.svc.cluster.local:5432 (user in db-team) | allow-egress-db-team-postgres | Allowed; TCP session established | Flow visible, verdict FORWARDED |
| Connect to postgres.db.svc.cluster.local:5432 (user not in db-team) | default-deny-all | Blocked; packets are dropped silently, so the connection times out | POLICY_DENIED drop logged |
| Connect to another session Pod in ssh-sessions (any user) | default-deny-all (egress at the source, ingress at the destination) | Blocked at the source Pod’s egress | POLICY_DENIED drop on source Pod egress |
| GET /admin on internal-api:8080 (api-team user) | allow-egress-api-team-l7 Cilium L7 rule | Blocked by Envoy at L7; HTTP 403 returned | L7 policy denied flow logged in Hubble |
| GET /status on internal-api:8080 (api-team user) | allow-egress-api-team-l7 permits GET /status | Allowed; response forwarded | Flow visible, verdict FORWARDED |
| Connect to Kubernetes API 10.96.0.1:443 | default-deny-all, with block-egress-kube-apiserver as backstop | Blocked; no token usable | POLICY_DENIED drop, alert fires if rate > threshold |
| DNS query for attacker.com TXT record | Cilium DNS policy allow-egress-dns-controlled | Blocked; REFUSED returned by Cilium’s DNS proxy (default reject code) | DNS policy denied logged by Cilium |
| DNS query for postgres.db.svc.cluster.local (any session Pod) | Cilium DNS policy permits this FQDN | Allowed; resolves to ClusterIP | DNS flow forwarded |
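
The blocked rows in this table can be checked from inside a session Pod with a small probe. The sketch below is written for a session without the db-team scope, so every listed address is expected to be unreachable; the addresses come from the examples in this article and the two-second timeout is an arbitrary choice. A policy drop shows up as a timeout, which this probe treats the same as any other connection failure.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// probeTimeout bounds each attempt; dropped packets produce no reply,
// so a blocked destination is only detected when the timeout expires.
const probeTimeout = 2 * time.Second

// reachable reports whether a TCP connection to addr can be established.
func reachable(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, probeTimeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Destinations that must NOT be reachable from an unscoped session Pod.
	mustBeBlocked := []string{
		"postgres.db.svc.cluster.local:5432",
		"10.96.0.1:443",
	}
	failed := false
	for _, addr := range mustBeBlocked {
		if reachable(addr) {
			fmt.Printf("FAIL %s is reachable\n", addr)
			failed = true
		} else {
			fmt.Printf("ok   %s is blocked\n", addr)
		}
	}
	if failed {
		os.Exit(1)
	}
}
```

Run it after every policy change; a non-zero exit code means a destination that should be denied accepted a connection.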

Trade-offs

| Approach | Benefit | Cost / Risk |
| --- | --- | --- |
| Per-label NetworkPolicy (one policy per access scope) | Fine-grained per-user egress control; additive and auditable; maps directly to RBAC groups | Label management overhead: a label typo in the webhook silently applies the wrong policy; requires maintaining policy objects for every scope |
| Cilium L7 CiliumNetworkPolicy vs. standard NetworkPolicy | Path and method-level enforcement; prevents privilege escalation within an allowed service | Cilium-specific; not portable to other CNIs; L7 inspection adds latency (~1ms per request); CVE-2026-33726-class bypasses possible if Cilium misconfigured |
| Custom DNS resolver vs. cluster kube-dns | Complete control over resolvable FQDNs; eliminates DNS exfiltration path | Breaks service discovery for services not explicitly listed; increases operational burden when adding new allowed services; CoreDNS misconfiguration can break all session connectivity |
| automountServiceAccountToken: false in webhook | Removes the API server credential from the Pod entirely | Breaks any tooling inside the session container that uses the default token for cluster access; must be accounted for in session image design |
| Hubble network observability | Real-time visibility into policy violations; enables anomaly detection | Hubble relay adds memory and CPU overhead; flow data is sensitive and must itself be access-controlled; without Hubble, policy violations are invisible |
| dnsPolicy: None with explicit nameservers | Eliminates dependence on cluster DNS; enables full DNS control | DNS resolution fails entirely if the custom resolver is unavailable; session Pods need a network path to the DNS control namespace, which must be explicitly allowed |

Failure Modes

| Failure Mode | Root Cause | Impact | Detection and Mitigation |
| --- | --- | --- | --- |
| NetworkPolicy not applied before Pod starts | Kubernetes creates and schedules the session Pod before the CNI has programmed the policies that select it | Window (typically <100ms but up to several seconds under load) where the session Pod has unrestricted network access | Create a validating admission webhook that blocks Pod creation in ssh-sessions unless required labels are present; monitor Pod creation-to-policy-applied latency via CNI metrics |
| Label typo in webhook response | Config webhook returns ssh-target: db_team (underscore) instead of ssh-target: db-team (hyphen) | Session Pod matches no scope policy; default-deny applies; session appears to have no network access | Validate all label values in the webhook against an allowlist before returning; emit a webhook error metric when an unrecognised scope is returned |
| Cilium DNS policy missing a required FQDN | A new internal service is added but the DNS CiliumNetworkPolicy is not updated | All DNS queries for the new service are rejected by Cilium’s DNS proxy (REFUSED by default); session users cannot reach the service and see confusing resolution errors | Maintain DNS policy as code alongside service definitions; use integration tests that verify DNS resolution from a test Pod in ssh-sessions after policy changes |
| Cilium L7 policy bypass (CVE-2026-33726 class) | Per-endpoint routing + BPF host routing disabled; traffic to same-node backends bypasses Envoy | L7 HTTP path restrictions silently not enforced; session Pod can call any API endpoint on same-node backends | Verify Cilium config: enable-endpoint-routes and tunnel mode; upgrade to patched Cilium version; use active probing tests that attempt blocked paths and assert 403 |
| Hubble not deployed or relay unavailable | Hubble disabled in Cilium Helm values or relay Pod crash-looping | Policy drops are invisible; no alerting on scanning or exfiltration attempts | Include Hubble relay health in cluster readiness checks; alert on cilium_endpoint_state and relay Pod readiness; without Hubble, fall back to cilium monitor on individual nodes |
| automountServiceAccountToken not set to false | Webhook omits the field; ContainerSSH default Pod spec mounts the default service account token | Session Pod carries API server credentials; the token can be copied out and used from elsewhere even if network policy blocks direct API server access from session Pods | Add a MutatingAdmissionWebhook or OPA/Gatekeeper policy that enforces automountServiceAccountToken: false on all Pods in the ssh-sessions namespace as a backstop to the ContainerSSH webhook |
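
Several mitigations in this table reduce to the same check: reject a session Pod whose labels or token setting are not what the policies expect. A minimal sketch of that check follows, written as a pure function that could sit inside the config webhook or an admission webhook. The sessionPod type is a hypothetical stand-in for the relevant Pod fields, and the scope allowlist mirrors the groups used in Step 2.

```go
package main

import (
	"errors"
	"fmt"
)

// sessionPod is a stand-in for the Pod fields this check inspects.
type sessionPod struct {
	Labels                       map[string]string
	AutomountServiceAccountToken *bool
}

// allowedScopes lists every ssh-target value that has a matching policy.
var allowedScopes = map[string]bool{
	"db-team":  true,
	"api-team": true,
	"readonly": true,
}

// validateSessionPod returns an error describing the first violation found.
func validateSessionPod(p sessionPod) error {
	if p.Labels["app"] != "ssh-session" {
		return errors.New(`label app must be "ssh-session"`)
	}
	if p.Labels["session-user"] == "" {
		return errors.New("label session-user is required")
	}
	// A missing ssh-target is acceptable (default-deny applies);
	// an unrecognised one is almost certainly a typo.
	if scope, ok := p.Labels["ssh-target"]; ok && !allowedScopes[scope] {
		return fmt.Errorf("unrecognised ssh-target %q", scope)
	}
	if p.AutomountServiceAccountToken == nil || *p.AutomountServiceAccountToken {
		return errors.New("automountServiceAccountToken must be set to false")
	}
	return nil
}

func main() {
	off := false
	pod := sessionPod{
		Labels: map[string]string{
			"app":          "ssh-session",
			"session-user": "alice",
			"ssh-target":   "db_team", // underscore typo from the table above
		},
		AutomountServiceAccountToken: &off,
	}
	fmt.Println(validateSessionPod(pod))
}
```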