Hardening the ContainerSSH Config and Auth Webhook: Identity Integration and Request Security

The Webhook as the Security Perimeter

ContainerSSH’s design is deliberately minimal: it terminates SSH connections, then outsources every meaningful security decision to an HTTP webhook. The webhook answers two questions — should this user be allowed in, and what container should they get? That design makes the webhook the trust boundary of the entire system. ContainerSSH itself enforces nothing beyond TLS and the SSH protocol; all identity and policy logic lives in the webhook.

The consequence is that the webhook’s security posture determines ContainerSSH’s security posture. A webhook that returns {"success": true} for every request is, functionally, an unauthenticated bastion. A webhook that accepts requests from any caller and returns a container spec with hostPath mounts and root privileges is worse than having no access control at all.

ContainerSSH 0.5 and 0.6 support two webhook endpoints: an authentication webhook (validates credentials, returns success/failure) and a configuration webhook (returns the container spec for the session). Both can be the same service. Both have the same threat surface. This article covers securing both, integrating them with real identity providers, and making them resilient without trading away security for availability.

The architecture described here fits naturally into the zero-trust model from Zero-Trust Architecture Principles: every SSH session is a fresh authorization decision, and the webhook is the policy enforcement point.

Threat Model

Before hardening, it is worth being explicit about what can go wrong.

Threat 1: Direct webhook invocation. Both webhooks are ordinary HTTP endpoints. If they accept requests from any source, an attacker who discovers the URLs can call them directly, bypassing ContainerSSH entirely. The auth endpoint then becomes an oracle for testing credentials, and the config endpoint returns the container configuration for any username the attacker supplies. There is no SSH handshake, no client authentication, no connection rate limiting. The attacker gets container specs at the cost of an HTTP request.

Threat 2: LDAP injection via SSH username. ContainerSSH passes the SSH username verbatim to the webhook. If the webhook uses that username in an LDAP search filter without sanitisation — (&(uid=<username>)(memberOf=...)) — a username like *)(uid=*))(|(uid=* can collapse the filter into a condition that matches every user in the directory. The attacker doesn’t need to know valid credentials; they just need to craft a username that manipulates the filter.
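
To make the collapse concrete, the following sketch builds the filter both ways using only the standard library. The escaping table follows RFC 4515; the function names and the group DN are illustrative assumptions, not part of ContainerSSH or any LDAP library.

```python
# Illustrative only: shows how an unescaped username rewrites an LDAP filter.
# escape_ldap_value and both builder functions are hypothetical helpers.

# RFC 4515 requires these characters to be hex-escaped inside a filter value.
_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\x00": r"\00"}

GROUP_DN = "cn=ssh-admins,ou=groups,dc=corp,dc=internal"  # assumed example DN


def escape_ldap_value(value: str) -> str:
    """Escape filter metacharacters one character at a time."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def naive_filter(username: str) -> str:
    # Vulnerable: the username is interpolated verbatim.
    return f"(&(uid={username})(memberOf={GROUP_DN}))"


def safe_filter(username: str) -> str:
    # The username can only ever be a literal value, never filter syntax.
    return f"(&(uid={escape_ldap_value(username)})(memberOf={GROUP_DN}))"


hostile = "*)(uid=*))(|(uid=*"
print(naive_filter(hostile))  # the first clause closes early and matches any uid
print(safe_filter(hostile))   # metacharacters appear only in escaped form
```

For an ordinary username such as alice the two functions produce the same filter, which is why this class of bug tends to survive functional testing.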

Threat 3: Overly permissive container configuration. The configuration webhook controls everything about the container the user lands in: image, user, capabilities, volume mounts, network namespace. A webhook that returns an identical privileged spec for all users — because the developer copied a working example from the docs without adjusting it — provides session isolation but no resource or privilege isolation. Users get root in containers with host filesystem access.

Threat 4: Webhook compromise. If an attacker gains code execution on the webhook process or modifies the webhook’s backing policy store (a database, a file, a configuration map), they can alter the response for any user. Unlike a compromised firewall rule or a misconfigured RBAC binding, a webhook compromise is invisible to standard audit tooling — ContainerSSH logs that access was granted, not that the granting decision was fraudulent.

Threat 5: Availability as a security property. ContainerSSH fails closed — if the webhook is unreachable, SSH connections are rejected. This is the safe default, but it means webhook availability directly affects operational continuity. An attacker who can take down the webhook can deny SSH access to the entire fleet. HA design is therefore a security requirement, not just an ops requirement.

ContainerSSH Webhook Protocol

Request Structure

Both the auth webhook and the config webhook receive POST requests with a JSON body. The auth webhook receives:

{
  "username": "alice",
  "remoteAddress": "192.0.2.45:51234",
  "connectionId": "3f8a7c2e-1b4d-4e9f-a2c1-7d5e8f3a9b0c",
  "passwordAuthRequest": {
    "password": "hunter2"
  }
}

For public key authentication, passwordAuthRequest is replaced with:

"publicKeyAuthRequest": {
  "publicKey": "ssh-ed25519 AAAA..."
}

The connectionId is a UUID generated by ContainerSSH per SSH connection. It appears in ContainerSSH’s audit log, making it the correlation key between the SSH session, the webhook decision, and the container lifecycle.

The auth webhook response is:

{
  "success": true,
  "error": ""
}

On failure, set success: false and populate error with a message that will appear in ContainerSSH’s logs (not surfaced to the SSH client).
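
A webhook has to treat this body as untrusted input even when the caller is authenticated. The following is a minimal parsing sketch, assuming the field names shown above; the AuthRequest class and the parse_auth_request function are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthRequest:
    username: str
    remote_address: str
    connection_id: str
    password: Optional[str] = None    # set for password authentication
    public_key: Optional[str] = None  # set for public key authentication


def parse_auth_request(raw: bytes) -> AuthRequest:
    """Parse and type-check the request body; raise ValueError on anything odd."""
    try:
        body = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(body, dict):
        raise ValueError("body must be a JSON object")

    def text(obj: dict, key: str) -> str:
        value = obj.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"missing or non-string field: {key}")
        return value

    password = public_key = None
    if isinstance(body.get("passwordAuthRequest"), dict):
        password = text(body["passwordAuthRequest"], "password")
    elif isinstance(body.get("publicKeyAuthRequest"), dict):
        public_key = text(body["publicKeyAuthRequest"], "publicKey")
    else:
        raise ValueError("no credential block in request")

    return AuthRequest(
        username=text(body, "username"),
        remote_address=text(body, "remoteAddress"),
        connection_id=text(body, "connectionId"),
        password=password,
        public_key=public_key,
    )
```

A handler built on this would answer any ValueError with a denial rather than letting the exception propagate.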

The config webhook receives the same request envelope (minus the credential fields) and responds with a partial or complete container specification. ContainerSSH merges the response into its base configuration. A minimal response targeting the Kubernetes backend:

{
  "config": {
    "backend": "kubernetes",
    "kubernetes": {
      "pod": {
        "spec": {
          "containers": [
            {
              "name": "shell",
              "image": "ghcr.io/containerssh/containerssh-guest-image:latest",
              "securityContext": {
                "runAsNonRoot": true,
                "runAsUser": 1000,
                "readOnlyRootFilesystem": false,
                "allowPrivilegeEscalation": false,
                "capabilities": {
                  "drop": ["ALL"]
                }
              }
            }
          ]
        }
      }
    }
  }
}

An empty config object in the response causes ContainerSSH to use its built-in defaults — which may be permissive depending on how the base configuration is written. Always return an explicit spec.
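
One way to enforce this is to check every response against a small policy before it leaves the webhook. The sketch below assumes the Kubernetes response shape shown above; check_container_policy is a hypothetical helper, and the rules it enforces are examples rather than a complete policy.

```python
def check_container_policy(response: dict) -> list[str]:
    """Return a list of policy violations; an empty list means the spec passes."""
    problems: list[str] = []
    try:
        spec = response["config"]["kubernetes"]["pod"]["spec"]
        containers = spec["containers"]
    except (KeyError, TypeError):
        return ["response does not contain an explicit pod spec"]
    if not containers:
        return ["pod spec lists no containers"]

    if spec.get("hostNetwork") or spec.get("hostPID"):
        problems.append("pod shares a host namespace")
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            problems.append(f"hostPath volume: {volume.get('name')}")

    for container in containers:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged"):
            problems.append(f"{name}: privileged")
        if ctx.get("runAsNonRoot") is not True:
            problems.append(f"{name}: runAsNonRoot is not true")
        if ctx.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation is not false")
        if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
            problems.append(f"{name}: capabilities are not dropped")
    return problems
```

A webhook could call this on its own output and refuse to answer when the list is non-empty, so that a bad template is caught before ContainerSSH ever sees it.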

ContainerSSH Configuration

In config.yaml, the webhook endpoints are declared under auth and configserver:

auth:
  webhook:
    url: https://webhook.internal:8443/auth
    timeout: 10s
    cacert: /etc/containerssh/webhook-ca.crt
    clientcert: /etc/containerssh/containerssh-client.crt
    clientkey: /etc/containerssh/containerssh-client.key

configserver:
  webhook:
    url: https://webhook.internal:8443/config
    timeout: 10s
    cacert: /etc/containerssh/webhook-ca.crt
    clientcert: /etc/containerssh/containerssh-client.crt
    clientkey: /etc/containerssh/containerssh-client.key

The cacert field pins the CA that signed the webhook server’s certificate. The clientcert and clientkey fields configure mutual TLS — ContainerSSH presents a client certificate when calling the webhook. This is the primary mechanism for preventing direct webhook invocation (Threat 1).

mTLS Between ContainerSSH and the Webhook

Generating the Certificate Chain

Use a dedicated CA for the ContainerSSH-to-webhook channel. Do not reuse your cluster’s internal CA or a public CA — you want to be able to revoke and rotate this certificate independently.

# Generate the internal CA
openssl genrsa -out webhook-ca.key 4096
openssl req -new -x509 -days 3650 -key webhook-ca.key \
  -subj "/CN=ContainerSSH Webhook CA/O=Internal" \
  -out webhook-ca.crt

# Generate the ContainerSSH client certificate
openssl genrsa -out containerssh-client.key 2048
openssl req -new -key containerssh-client.key \
  -subj "/CN=containerssh-server/O=Internal" \
  -out containerssh-client.csr

openssl x509 -req -days 365 -in containerssh-client.csr \
  -CA webhook-ca.crt -CAkey webhook-ca.key -CAcreateserial \
  -out containerssh-client.crt

# Generate the webhook server certificate
openssl genrsa -out webhook-server.key 2048
openssl req -new -key webhook-server.key \
  -subj "/CN=webhook.internal/O=Internal" \
  -out webhook-server.csr

cat > webhook-server-ext.cnf <<EOF
subjectAltName=DNS:webhook.internal,DNS:webhook-svc.containerssh.svc.cluster.local
EOF

openssl x509 -req -days 365 -in webhook-server.csr \
  -CA webhook-ca.crt -CAkey webhook-ca.key -CAcreateserial \
  -extfile webhook-server-ext.cnf \
  -out webhook-server.crt

Webhook Server: Enforcing Client Certificate Validation

In Go, configuring the webhook HTTP server to require and verify client certificates:

package main

import (
    "crypto/tls"
    "crypto/x509"
    "log/slog"
    "net/http"
    "os"
)

func newTLSServer(handler http.Handler) *http.Server {
    caCert, err := os.ReadFile("/etc/webhook/webhook-ca.crt")
    if err != nil {
        slog.Error("failed to read CA cert", "error", err)
        os.Exit(1)
    }

    caPool := x509.NewCertPool()
    if !caPool.AppendCertsFromPEM(caCert) {
        slog.Error("failed to parse CA cert")
        os.Exit(1)
    }

    serverCert, err := tls.LoadX509KeyPair(
        "/etc/webhook/webhook-server.crt",
        "/etc/webhook/webhook-server.key",
    )
    if err != nil {
        slog.Error("failed to load server cert/key", "error", err)
        os.Exit(1)
    }

    tlsCfg := &tls.Config{
        Certificates: []tls.Certificate{serverCert},
        ClientAuth:   tls.RequireAndVerifyClientCert,
        ClientCAs:    caPool,
        MinVersion:   tls.VersionTLS13,
    }

    return &http.Server{
        Addr:      ":8443",
        Handler:   handler,
        TLSConfig: tlsCfg,
    }
}

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("/auth", handleAuth)
    mux.HandleFunc("/config", handleConfig)

    srv := newTLSServer(mux)
    if err := srv.ListenAndServeTLS("", ""); err != nil {
        slog.Error("server error", "error", err)
        os.Exit(1)
    }
}

With tls.RequireAndVerifyClientCert, any request without a valid client certificate signed by the pinned CA is rejected at the TLS handshake — before the HTTP handler runs. An attacker calling the webhook directly without the ContainerSSH client cert gets a TLS error, not a JSON response.
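
The Python webhook shown later in this article is not wired to a listener. A rough equivalent of the Go TLS settings using the standard ssl module might look like the following; the function names are hypothetical and the file paths are assumptions chosen to mirror the Go example.

```python
import ssl
from http.server import HTTPServer


def new_mtls_context() -> ssl.SSLContext:
    """Server-side context that refuses clients without a verified certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # Counterpart of tls.RequireAndVerifyClientCert in the Go server.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def serve_mtls(handler_class,
               ca="/etc/webhook/webhook-ca.crt",
               cert="/etc/webhook/webhook-server.crt",
               key="/etc/webhook/webhook-server.key",
               port=8443) -> None:
    ctx = new_mtls_context()
    ctx.load_verify_locations(cafile=ca)  # only this CA may vouch for clients
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    httpd = HTTPServer(("", port), handler_class)
    # Wrapping the listening socket makes every accepted connection TLS.
    httpd.socket = ctx.wrap_socket(httpd.socket, server_side=True)
    httpd.serve_forever()
```

Because the handshake runs as part of accepting the connection, a slow client can stall a single-threaded server; a threaded server or a TLS-terminating proxy is probably a better fit outside of testing.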

To further restrict callers, verify the client certificate’s Common Name in middleware:

func requireContainerSSHCert(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // r.TLS is nil on a plaintext connection; check it before dereferencing.
        if r.TLS == nil || len(r.TLS.PeerCertificates) == 0 {
            http.Error(w, "client cert required", http.StatusUnauthorized)
            return
        }
        cn := r.TLS.PeerCertificates[0].Subject.CommonName
        if cn != "containerssh-server" {
            slog.Warn("unexpected client cert CN", "cn", cn, "remote", r.RemoteAddr)
            http.Error(w, "unauthorized client", http.StatusForbidden)
            return
        }
        next.ServeHTTP(w, r)
    })
}
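
The same check can be made in a Python handler, where the verified peer certificate is available as a dictionary from SSLSocket.getpeercert(). The helper names below are hypothetical; the expected CN is the one used in the openssl commands above.

```python
from typing import Optional

EXPECTED_CN = "containerssh-server"  # CN issued to ContainerSSH in the openssl step


def peer_common_name(peercert: Optional[dict]) -> Optional[str]:
    """Extract commonName from the dict returned by SSLSocket.getpeercert()."""
    if not peercert:
        return None
    # "subject" is a tuple of RDNs, each RDN a tuple of (attribute, value) pairs.
    for rdn in peercert.get("subject", ()):
        for attribute, value in rdn:
            if attribute == "commonName":
                return value
    return None


def is_containerssh(peercert: Optional[dict]) -> bool:
    # No certificate, or a certificate with a different CN, is rejected.
    return peer_common_name(peercert) == EXPECTED_CN
```

Inside a BaseHTTPRequestHandler served over a TLS-wrapped socket, self.connection.getpeercert() supplies the argument.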

LDAP Integration: Group-Based Container Configuration

Python Webhook with LDAP Authentication

The following Python webhook uses ldap3 to validate credentials and resolve group membership, then returns a container configuration appropriate for the user’s group. Input sanitisation happens before any LDAP operation.

import os
import re
import json
import logging
import ssl
from http.server import HTTPServer, BaseHTTPRequestHandler
from ldap3 import Server, Connection, ALL, SUBTREE, SAFE_SYNC
from ldap3.utils.conv import escape_filter_chars

logging.basicConfig(
    format='%(asctime)s %(levelname)s %(name)s %(message)s',
    level=logging.INFO,
)
logger = logging.getLogger("containerssh-webhook")

LDAP_SERVER = "ldaps://ldap.corp.internal:636"
LDAP_BASE_DN = "dc=corp,dc=internal"
LDAP_BIND_DN = "cn=containerssh-svc,ou=service-accounts,dc=corp,dc=internal"
LDAP_BIND_PASSWORD = os.environ["LDAP_BIND_PASSWORD"]

USERNAME_RE = re.compile(r'^[a-zA-Z0-9_-]{1,64}$')

CONTAINER_CONFIGS = {
    "ssh-admins": {
        "image": "ghcr.io/corp/admin-shell:latest",
        "runAsUser": 1000,
        "readOnlyRootFilesystem": False,
        "capabilities": {"drop": ["ALL"], "add": []},
        "extraEnv": [{"name": "KUBECONFIG", "value": "/home/user/.kube/config"}],
    },
    "ssh-developers": {
        "image": "ghcr.io/corp/dev-shell:latest",
        "runAsUser": 1000,
        "readOnlyRootFilesystem": False,
        "capabilities": {"drop": ["ALL"], "add": []},
        "extraEnv": [],
    },
    "ssh-readonly": {
        "image": "ghcr.io/corp/readonly-shell:latest",
        "runAsUser": 65534,
        "readOnlyRootFilesystem": True,
        "capabilities": {"drop": ["ALL"], "add": []},
        "extraEnv": [],
    },
}


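def validate_username(username: str) -> bool:
    # fullmatch, not match: with match(), "$" also accepts a trailing newline
    # ("alice\n"). The isinstance check rejects non-string JSON values.
    return isinstance(username, str) and USERNAME_RE.fullmatch(username) is not None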


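def resolve_user_group(username: str) -> str | None:
    # Escape even though the regex already restricts the alphabet (defence in depth).
    safe_username = escape_filter_chars(username)
    server = Server(LDAP_SERVER, use_ssl=True, get_info=ALL)

    try:
        conn = Connection(
            server,
            user=LDAP_BIND_DN,
            password=LDAP_BIND_PASSWORD,
            client_strategy=SAFE_SYNC,
            auto_bind=True,
        )
    except Exception as e:
        logger.error("ldap bind failed", extra={"error": str(e)})
        return None

    try:
        # Dict order is the precedence order: the first matching group wins.
        for group_cn in CONTAINER_CONFIGS:
            group_dn = f"cn={group_cn},ou=groups,{LDAP_BASE_DN}"
            search_filter = f"(&(uid={safe_username})(memberOf={group_dn}))"
            # With SAFE_SYNC, search() returns (status, result, response, request);
            # read the returned response rather than connection state.
            status, _result, response, _request = conn.search(
                search_base=f"ou=users,{LDAP_BASE_DN}",
                search_filter=search_filter,
                search_scope=SUBTREE,
                attributes=["uid"],
            )
            if status and any(
                entry.get("type") == "searchResEntry" for entry in (response or [])
            ):
                return group_cn
        return None
    except Exception as e:
        # Any directory error is treated as "no group", which denies access.
        logger.error("ldap search failed", extra={"error": str(e)})
        return None
    finally:
        conn.unbind()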


def build_config_response(group: str) -> dict:
    cfg = CONTAINER_CONFIGS[group]
    return {
        "config": {
            "backend": "kubernetes",
            "kubernetes": {
                "pod": {
                    "spec": {
                        "containers": [
                            {
                                "name": "shell",
                                "image": cfg["image"],
                                "env": cfg["extraEnv"],
                                "securityContext": {
                                    "runAsNonRoot": True,
                                    "runAsUser": cfg["runAsUser"],
                                    "readOnlyRootFilesystem": cfg["readOnlyRootFilesystem"],
                                    "allowPrivilegeEscalation": False,
                                    "capabilities": cfg["capabilities"],
                                },
                            }
                        ],
                        "securityContext": {
                            "runAsNonRoot": True,
                            "seccompProfile": {"type": "RuntimeDefault"},
                        },
                    }
                }
            },
        }
    }


class WebhookHandler(BaseHTTPRequestHandler):
    def _read_body(self) -> dict:
        length = int(self.headers.get("Content-Length", 0))
        return json.loads(self.rfile.read(length))

    def _respond(self, status: int, body: dict):
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", len(payload))
        self.end_headers()
        self.wfile.write(payload)

    def do_POST(self):
        body = self._read_body()
        username = body.get("username", "")
        connection_id = body.get("connectionId", "")
        remote_addr = body.get("remoteAddress", "")

        if not validate_username(username):
            logger.warning(
                "invalid username rejected",
                extra={"username": repr(username), "connectionId": connection_id},
            )
            self._respond(200, {"success": False, "error": "invalid username"})
            return

        if self.path == "/auth":
            self._handle_auth(body, username, connection_id, remote_addr)
        elif self.path == "/config":
            self._handle_config(body, username, connection_id, remote_addr)
        else:
            self.send_response(404)
            self.end_headers()

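    def _verify_password(self, username: str, password: str) -> bool:
        # Reject empty passwords outright: a simple bind with an empty password
        # can be treated by the directory as an anonymous bind that succeeds.
        if not password:
            return False
        # Assumption: user entries live directly under ou=users. The username
        # has already passed validate_username, so it is safe inside a DN.
        user_dn = f"uid={username},ou=users,{LDAP_BASE_DN}"
        try:
            conn = Connection(Server(LDAP_SERVER, use_ssl=True), user=user_dn, password=password)
            bound = conn.bind()  # False when the directory rejects the credentials
            conn.unbind()
            return bool(bound)
        except Exception as e:
            logger.error("ldap user bind failed", extra={"error": str(e)})
            return False

    def _handle_auth(self, body, username, connection_id, remote_addr):
        # Only password authentication is handled here; other methods are denied.
        password = (body.get("passwordAuthRequest") or {}).get("password", "")
        # The credential must verify AND the user must be in an allowed group.
        group = resolve_user_group(username) if self._verify_password(username, password) else None
        success = group is not None

        logger.info(
            "auth decision",
            extra={
                "username": username,
                "success": success,
                "group": group,
                "connectionId": connection_id,
                "remoteAddress": remote_addr,
            },
        )

        if success:
            self._respond(200, {"success": True, "error": ""})
        else:
            self._respond(200, {"success": False, "error": "user not found or not in allowed group"})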

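    def _handle_config(self, body, username, connection_id, remote_addr):
        group = resolve_user_group(username)
        if group is None:
            logger.error(
                "config requested for unknown user",
                extra={"username": username, "connectionId": connection_id},
            )
            # Fail closed: an empty config would let ContainerSSH fall back to its
            # base configuration. Return an error status instead (assumption:
            # ContainerSSH treats a non-200 config response as a failure).
            self._respond(500, {"error": "no configuration for user"})
            return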

        config_response = build_config_response(group)
        logger.info(
            "config issued",
            extra={
                "username": username,
                "group": group,
                "image": CONTAINER_CONFIGS[group]["image"],
                "connectionId": connection_id,
            },
        )
        self._respond(200, config_response)

The critical pattern in resolve_user_group is escape_filter_chars from ldap3. This function escapes all LDAP special characters (*, (, ), \, NUL) in the username before it enters the filter string. Combined with the regex validation that rejects any username not matching [a-zA-Z0-9_-]{1,64}, an attacker has no LDAP metacharacters to work with.

Request Signing with HMAC (Defence in Depth)

When mTLS is the primary caller-verification mechanism, HMAC signing provides a second independent layer — useful if a bug in the TLS configuration or a misconfigured proxy strips the client cert verification. Add a shared secret to ContainerSSH’s webhook config and verify the X-ContainerSSH-HMAC header in the webhook:

import (
    "bytes"
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "io"
    "net/http"
    "os"
)

var hmacSecret = []byte(os.Getenv("WEBHOOK_HMAC_SECRET"))

func validateHMAC(r *http.Request, body []byte) bool {
    // Fail closed if the secret is unset: an empty key would still produce a MAC.
    if len(hmacSecret) == 0 {
        return false
    }
    provided := r.Header.Get("X-ContainerSSH-HMAC")
    if provided == "" {
        return false
    }
    mac := hmac.New(sha256.New, hmacSecret)
    mac.Write(body)
    expected := hex.EncodeToString(mac.Sum(nil))
    // hmac.Equal compares in constant time.
    return hmac.Equal([]byte(provided), []byte(expected))
}

func hmacMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        body, err := io.ReadAll(r.Body)
        if err != nil || !validateHMAC(r, body) {
            http.Error(w, "invalid signature", http.StatusUnauthorized)
            return
        }
        // Restore the body so the downstream handler can read it again.
        r.Body = io.NopCloser(bytes.NewReader(body))
        next.ServeHTTP(w, r)
    })
}

Note: ContainerSSH 0.5/0.6 does not natively add HMAC headers. This pattern requires a thin middleware proxy in front of ContainerSSH that adds the header, or a custom ContainerSSH build. Use mTLS as the primary control; HMAC is supplemental.
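
Both halves of the scheme, the signing side that such a proxy would run and the verifying side in the webhook, can be sketched in Python. The header name follows the Go example; the function names are placeholders, and nothing here is a ContainerSSH feature.

```python
import hashlib
import hmac

HEADER = "X-ContainerSSH-HMAC"  # same (non-native) header as the Go example


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 over the exact request bytes."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_body(secret: bytes, body: bytes, provided: str) -> bool:
    if not secret or not provided:
        return False  # fail closed on a missing secret or header
    expected = sign_body(secret, body)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

The signature covers the raw bytes, so the proxy must sign exactly what it forwards; re-serialising the JSON between signing and sending would invalidate it.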

OIDC Integration

For environments using an OIDC provider (Okta, Keycloak, Azure AD), the webhook can validate a bearer token passed as the SSH password. SSH clients that support keyboard-interactive authentication can prompt users to paste an ID token or device-flow code.

The Go auth handler validates the token against the provider’s JWKS endpoint:

package main

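import (
    "context"
    "encoding/json"
    "log/slog"
    "net/http"

    "github.com/coreos/go-oidc/v3/oidc"
)

// AuthRequest, AuthResponse, and writeJSON are assumed to be defined elsewhere
// in this package, mirroring the JSON shapes shown under Request Structure.

const (
    issuerURL = "https://sso.corp.internal/realms/corp"
    clientID  = "containerssh"
)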

func newOIDCVerifier(ctx context.Context) (*oidc.IDTokenVerifier, error) {
    provider, err := oidc.NewProvider(ctx, issuerURL)
    if err != nil {
        return nil, err
    }
    return provider.Verifier(&oidc.Config{ClientID: clientID}), nil
}

func handleAuthOIDC(verifier *oidc.IDTokenVerifier) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        var req AuthRequest
        if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
            http.Error(w, "bad request", http.StatusBadRequest)
            return
        }

        if req.PasswordAuthRequest == nil {
            writeJSON(w, AuthResponse{Success: false, Error: "password auth required"})
            return
        }

        ctx := r.Context()
        token, err := verifier.Verify(ctx, req.PasswordAuthRequest.Password)
        if err != nil {
            slog.Warn("token verification failed",
                "username", req.Username,
                "error", err,
                "connectionId", req.ConnectionID,
            )
            writeJSON(w, AuthResponse{Success: false, Error: "invalid token"})
            return
        }

        var claims struct {
            Email             string   `json:"email"`
            PreferredUsername string   `json:"preferred_username"`
            Groups            []string `json:"groups"`
        }
        if err := token.Claims(&claims); err != nil {
            writeJSON(w, AuthResponse{Success: false, Error: "claims extraction failed"})
            return
        }

        // Ensure the SSH username matches the token's preferred_username claim
        if claims.PreferredUsername != req.Username {
            slog.Warn("username mismatch",
                "sshUsername", req.Username,
                "tokenUsername", claims.PreferredUsername,
                "connectionId", req.ConnectionID,
            )
            writeJSON(w, AuthResponse{Success: false, Error: "username mismatch"})
            return
        }

        slog.Info("auth decision",
            "username", req.Username,
            "success", true,
            "groups", claims.Groups,
            "connectionId", req.ConnectionID,
        )
        writeJSON(w, AuthResponse{Success: true})
    }
}

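The OIDC verifier fetches the provider’s JWKS automatically and caches it with rotation handling. token.Claims extracts the username and group claims. Because the config webhook does not receive the token, the group membership has to reach it another way, for example by looking the user up again or by storing the auth decision under the connectionId. The check that the SSH username equals the preferred_username claim prevents a user from authenticating as another user by presenting a valid token that was issued to someone else. The OIDC specification does not guarantee that preferred_username is unique or immutable, so this check is only as strong as the provider’s control over that claim.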
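
Once a token has been verified, mapping its claims to a container profile is plain data handling. The sketch below assumes that the groups claim carries the same group names used in the LDAP example and that earlier entries take precedence; GROUP_PRECEDENCE and select_profile are hypothetical names.

```python
from typing import Optional

# Highest-privilege group first; the first match decides the profile.
GROUP_PRECEDENCE = ("ssh-admins", "ssh-developers", "ssh-readonly")


def select_profile(ssh_username: str, claims: dict) -> Optional[str]:
    """Return the group whose container spec applies, or None to deny."""
    if claims.get("preferred_username") != ssh_username:
        return None  # token was issued to a different user
    groups = claims.get("groups")
    if not isinstance(groups, list):
        return None  # claim missing or malformed: deny rather than guess
    for group in GROUP_PRECEDENCE:
        if group in groups:
            return group
    return None
```

The function only makes sense on claims taken from a token whose signature, issuer, and audience have already been checked; it performs no cryptographic validation itself.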

Webhook High Availability

Architecture

Run the webhook as a Kubernetes Deployment with at least three replicas behind a ClusterIP Service. ContainerSSH’s auth.webhook.url points to the Service DNS name. If one replica is down during a rolling update, the other replicas continue serving requests.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: containerssh-webhook
  namespace: containerssh
spec:
  replicas: 3
  selector:
    matchLabels:
      app: containerssh-webhook
  template:
    metadata:
      labels:
        app: containerssh-webhook
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: containerssh-webhook
              topologyKey: kubernetes.io/hostname
      containers:
        - name: webhook
          image: ghcr.io/corp/containerssh-webhook:v1.4.2
          ports:
            - containerPort: 8443
          env:
            - name: LDAP_BIND_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: webhook-secrets
                  key: ldap-bind-password
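          readinessProbe:
            # The kubelet cannot present a client certificate, so an HTTPS probe
            # against the mTLS listener would fail the handshake. A TCP check
            # only confirms that the port accepts connections.
            tcpSocket:
              port: 8443
            initialDelaySeconds: 5
            periodSeconds: 10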
          volumeMounts:
            - name: tls
              mountPath: /etc/webhook
              readOnly: true
      volumes:
        - name: tls
          secret:
            secretName: webhook-tls

The podAntiAffinity rule ensures replicas are spread across nodes. A single node failure does not take down the entire webhook. For environments that require cross-zone availability, replace kubernetes.io/hostname with topology.kubernetes.io/zone.

Stateless Design

Webhook replicas should be stateless: they should not cache LDAP results or session state in memory. A group lookup against a nearby directory is usually fast enough to run on every request, and caching group membership in-process means that a user’s group change, including removal from a group, does not take effect until the cache expires. If state must be shared across replicas (for example, to detect the same connectionId being presented more than once), use a shared Redis cache with a short TTL keyed by connectionId.
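
The replay check reduces to one atomic "set if absent, with expiry" operation. The sketch below keeps the store behind a small interface so that a Redis client could be substituted; the in-memory class exists only to make the example runnable and is not suitable for multiple replicas. All names are illustrative, and the once-per-connection rule is an assumption to verify against your ContainerSSH version, since password retries within one SSH connection may reuse the connectionId.

```python
import time
from typing import Protocol


class SeenStore(Protocol):
    def add_if_absent(self, key: str, ttl_seconds: int) -> bool:
        """Return True if the key was newly recorded, False if already present."""
        ...


class InMemorySeenStore:
    """Single-process stand-in for a shared store; for demonstration only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry: dict[str, float] = {}

    def add_if_absent(self, key: str, ttl_seconds: int) -> bool:
        now = self._clock()
        # Drop expired entries so the dict does not grow without bound.
        self._expiry = {k: t for k, t in self._expiry.items() if t > now}
        if key in self._expiry:
            return False
        self._expiry[key] = now + ttl_seconds
        return True


def first_use(store: SeenStore, endpoint: str, connection_id: str,
              ttl_seconds: int = 60) -> bool:
    """Accept each (endpoint, connectionId) pair once within the TTL."""
    try:
        return store.add_if_absent(f"{endpoint}:{connection_id}", ttl_seconds)
    except Exception:
        return False  # fail closed if the store is unreachable
```

With Redis, the same semantics map onto a SET command with the NX and EX options, which records the key and its expiry in one step.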

Expected Behaviour

| Scenario | Webhook Response | ContainerSSH Action |
|---|---|---|
| Valid user in ssh-developers group, correct password | {"success": true} + dev container config | SSH session opened; dev-shell container started |
| Valid username, wrong password | {"success": false, "error": "user not found or not in allowed group"} | SSH connection rejected with auth failure |
| Unknown username | {"success": false, "error": "user not found or not in allowed group"} | SSH connection rejected; no container started |
| Direct HTTP call to webhook without client cert | TLS handshake error (no valid client certificate) | Request never reaches webhook handler; caller receives TLS alert |
| Username containing LDAP metacharacters (*)(uid=*)) | Fails regex validation; {"success": false, "error": "invalid username"} | SSH connection rejected; injection attempt logged |
| User not in any allowed SSH group | {"success": false} after LDAP search returns no results | SSH connection rejected |
| Webhook returns empty config object | ContainerSSH uses base configuration defaults | Container started with potentially permissive defaults; avoid this |

Trade-offs

| Decision | Option A | Option B | Guidance |
|---|---|---|---|
| Caller verification | mTLS: cryptographic proof of caller identity; requires cert management and rotation | HMAC signing: simpler to implement; shared secret must be securely distributed; does not prevent replay if secret is leaked | Prefer mTLS for production; add HMAC as defence in depth if the infrastructure supports it |
| Identity provider | OIDC: modern token-based flow; short-lived tokens; works with Okta, Keycloak, Azure AD; token contains claims | LDAP: ubiquitous in enterprises; group membership queries are well-understood; longer-lived binding sessions; requires service account | OIDC if the environment has it; LDAP for enterprise directories without an OIDC layer |
| Container configuration granularity | Per-group config: one container spec per group; low maintenance; group membership is the unit of access | Per-user config: arbitrary configuration per user; full flexibility; config store must be maintained per user | Start with per-group; add per-user overrides only when groups are insufficient |
| Webhook availability | Single instance: simple deployment; one process failure = all SSH access fails | HA deployment (3+ replicas): requires shared or stateless design; adds operational complexity; survives node failures | HA is required for any production deployment; ContainerSSH fails closed on webhook unavailability |

Failure Modes

| Failure | Symptom | Impact | Mitigation |
|---|---|---|---|
| LDAP server unreachable | Webhook returns {"success": false} for all users; LDAP connection timeout in logs | All SSH access denied; no new sessions can be established | Deploy LDAP replicas; configure ldap3 connection pool with fallback URIs; set connect_timeout shorter than ContainerSSH’s webhook timeout |
| ContainerSSH client cert expired | TLS handshake failure between ContainerSSH and webhook; webhook logs show TLS alert | All SSH access denied | Monitor cert expiry with Prometheus x509_cert_expiry exporter; automate rotation with cert-manager; set expiry to 90 days with 30-day renewal alert |
| Webhook pod crash (single replica) | ContainerSSH webhook timeout; SSH connections rejected | All SSH access denied until pod restarts | Run minimum 3 replicas; set resource requests/limits to prevent OOM; configure readiness probe to remove unhealthy pods from Service endpoints |
| Redis session store unavailable (if used) | Webhook cannot write/read session state; may fall back to allowing all requests depending on implementation | Session deduplication and replay protection disabled | Design webhook to fail closed on Redis unavailability, not open; log Redis errors; treat stateless mode as degraded, not normal |
| Webhook returns empty config object | ContainerSSH uses built-in defaults; container may start with unintended privileges | Users may get root containers or containers with broad capabilities depending on base config | Validate webhook response before returning; return an error if config cannot be determined; test base configuration defaults explicitly |
| ContainerSSH client certificate CN mismatch | Webhook’s requireContainerSSHCert middleware rejects the caller with 403 | Legitimate ContainerSSH instance cannot authenticate | Ensure CN in cert matches the string checked in requireContainerSSHCert middleware; document expected CN in runbook |

Audit Logging

Every auth and config decision should emit a structured log entry with fields that allow correlation across ContainerSSH, the webhook, and the container runtime. At minimum:

  • username — the SSH username
  • connectionId — ContainerSSH’s connection UUID (the cross-system correlation key)
  • remoteAddress — the SSH client’s IP
  • success — boolean auth decision
  • group — the group resolved for the user
  • image — the container image returned (config webhook only)
  • callerIP — the IP ContainerSSH connected from (webhook server side)
  • timestamp — RFC 3339 with nanosecond precision

Ship these logs to your SIEM. An alert on success=false rate exceeding baseline is an early indicator of credential stuffing or misconfiguration. An alert on unexpected image values in config webhook logs can detect webhook tampering (Threat 4).
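
With Python’s logging module, fields passed through extra are attached to the record but only appear in the output if the formatter includes them. A minimal JSON formatter sketch that emits the fields listed above follows; the class name, the function name, and the field whitelist are assumptions for illustration.

```python
import json
import logging
from datetime import datetime, timezone

AUDIT_FIELDS = ("username", "connectionId", "remoteAddress", "success",
                "group", "image", "callerIP")


class AuditJSONFormatter(logging.Formatter):
    """Render each record as one JSON object, including whitelisted extras."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # RFC 3339 in UTC; Python's datetime carries microsecond precision.
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in AUDIT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry, default=str)


def configure_audit_logging() -> logging.Logger:
    handler = logging.StreamHandler()
    handler.setFormatter(AuditJSONFormatter())
    logger = logging.getLogger("containerssh-webhook")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Calls such as logger.info("auth decision", extra={"username": ..., "connectionId": ...}) then produce one JSON object per line, which most log shippers can ingest without a custom parser.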