ContainerSSH Auth Webhook as a WebAssembly Edge Function: Low-Latency Sandboxed Authentication

The Authentication Bottleneck

ContainerSSH solves a real infrastructure problem: it intercepts SSH connections and launches an ephemeral container for each authenticated session, giving operators a programmable, auditable, and disposable SSH access layer. Every SSH connection starts with two sequential webhook calls — one to the authentication backend to verify the user’s credentials, one to the configuration backend to determine which container image and policy applies to that user. Both calls are synchronous and on the critical path: the SSH handshake waits for them.

A self-hosted webhook server that handles these calls has structural weaknesses independent of its implementation quality. Geographic distance between the ContainerSSH instance and the webhook server adds round-trip latency to every session establishment. A single webhook server handling traffic from multiple ContainerSSH deployments is a shared bottleneck — overloading it delays every SSH connection across every deployment simultaneously. A server that crashes blocks all SSH access until it recovers. And the server itself is an infrastructure component that needs to be provisioned, monitored, patched, and scaled.

The alternative architecture moves the webhook to the network edge. Cloudflare Workers and Fastly Compute run functions at Points of Presence (PoPs) distributed across the globe. A ContainerSSH instance in Frankfurt calls a Cloudflare Worker that executes in a Frankfurt PoP. A ContainerSSH instance in Singapore calls the same Worker, executing in Singapore. Because the function runs in a PoP near the caller, response times are typically under 10 ms regardless of region. Scaling is automatic: the edge platform scales to demand without operator intervention.

The sandbox adds a second benefit that goes beyond performance. The authentication policy code (OIDC token validation, group membership lookup, container config selection) runs inside an isolated runtime: a V8 isolate on Cloudflare Workers, a WebAssembly instance on Fastly Compute. Even if the policy code is compromised or contains a logic vulnerability, it cannot reach cloud provider metadata endpoints, internal network services, or host credentials. The blast radius of a compromised auth backend is bounded by the sandbox’s capability model.

This article covers the ContainerSSH webhook contract, a complete Cloudflare Workers implementation with OIDC token validation, and the equivalent Rust/WASM implementation for Fastly Compute@Edge.

Target systems: ContainerSSH 0.5+, Cloudflare Workers (wrangler 3.x), Fastly Compute@Edge (Rust SDK).


Threat Model

Four threat actors are relevant to an edge-deployed ContainerSSH auth backend. Each has a distinct attack surface and requires a specific control.

Adversary 1: Edge function compromise via malicious dependency or code injection. An attacker who compromises the Worker JavaScript bundle, through a supply chain attack on an npm dependency or a CI pipeline compromise, gains control of the auth policy code. The sandbox limits what they can do with that control: the V8 isolate has no access to cloud provider instance metadata, no access to the internal network behind the ContainerSSH deployment, and no persistent filesystem to read. It does not restrict outbound fetch() calls to public URLs, so the attacker can still exfiltrate whatever the function itself handles: presented tokens, the shared secret, and KV contents. The attacker can manipulate auth decisions and leak that data, but cannot pivot directly into the infrastructure the ContainerSSH deployment sits in front of.

Adversary 2: JWT forgery. In this design the user’s SSH client presents an OIDC token as the SSH password, as evidence of identity. An attacker who forges this token, or who presents an expired token, a token issued for a different audience, or a token with a tampered payload, must defeat RS256 signature verification and the claim checks that accompany it. The edge function validates the full JWT: signature against the IdP’s JWKS, expiry, issuer, and audience claim. A forged token is rejected at the edge, before the request touches the container infrastructure.

Adversary 3: Stale group membership cache. The edge function stores user-to-group mappings in Cloudflare KV. KV is an eventually consistent store with a configurable TTL. When an administrator removes a user from a group in the IdP, the KV entry for that user retains the old group membership until the TTL expires or an explicit cache invalidation is issued. During this window, the user continues to receive the container config that corresponds to their former group. This is not an authentication bypass — the user’s OIDC token still validates — but it is an authorization lag. The control is aggressive TTL settings (60–120 seconds) combined with a Cloudflare API-triggered KV key purge in the IdP’s user provisioning webhook.

Adversary 4: ContainerSSH calling a wrong or hijacked webhook endpoint. If the webhook URL configured in ContainerSSH is changed (via a misconfigured deployment, a DNS hijack, or a compromised configuration management system), ContainerSSH may send authentication requests to an attacker-controlled endpoint. The attacker’s endpoint returns {"success": true} for all requests, bypassing authentication entirely. Two TLS controls apply, in opposite directions. Server authentication protects against the hijack: ContainerSSH verifies the endpoint’s certificate against a pinned CA, so an endpoint reached through hijacked DNS fails the TLS handshake unless the attacker holds a certificate and private key for the Worker’s hostname issued by that CA. Client authentication (mTLS) protects the Worker: it is configured to require a client certificate, and ContainerSSH is provisioned with the corresponding key pair, so only legitimate ContainerSSH instances can call it. Neither control helps if the attacker can rewrite the ContainerSSH configuration itself, including the URL and the CA file; that case requires integrity controls on the configuration management system.
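
The claim checks listed under Adversary 2 (expiry, issuer, audience) can be sketched as a pure function that runs alongside signature verification. This is a minimal illustration, not a complete validator: the `checkClaims` name, the `ExpectedClaims` shape, and the clock-skew allowance are assumptions introduced here and are not part of ContainerSSH or any IdP SDK.

```typescript
// Hypothetical helper: validates registered claims only. Signature
// verification against the IdP's JWKS must still happen separately.
interface TokenClaims {
  iss?: unknown;
  aud?: unknown;
  exp?: unknown;
}

interface ExpectedClaims {
  issuer: string;
  audience: string;
  clockSkewSeconds: number; // assumed tolerance for clock drift
}

type ClaimResult = { ok: true } | { ok: false; reason: string };

function checkClaims(
  claims: TokenClaims,
  expected: ExpectedClaims,
  nowSeconds: number
): ClaimResult {
  // A missing or non-numeric exp is rejected, never treated as "no expiry".
  if (typeof claims.exp !== "number" || !Number.isFinite(claims.exp)) {
    return { ok: false, reason: "exp missing or not a number" };
  }
  if (claims.exp + expected.clockSkewSeconds <= nowSeconds) {
    return { ok: false, reason: "token expired" };
  }
  if (claims.iss !== expected.issuer) {
    return { ok: false, reason: "issuer mismatch" };
  }
  // aud may be a single string or an array of strings.
  const audiences: unknown[] = Array.isArray(claims.aud)
    ? claims.aud
    : [claims.aud];
  if (!audiences.includes(expected.audience)) {
    return { ok: false, reason: "audience mismatch" };
  }
  return { ok: true };
}
```

Returning a reason string rather than a bare boolean makes denied attempts easier to distinguish in logs; the reason should stay server-side and not be echoed to the SSH client.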


ContainerSSH Webhook Contract

Before implementing the edge function, the webhook protocol requires precise understanding. ContainerSSH makes two types of webhook calls: authentication and configuration. Both are HTTP POST requests with a JSON body.

Authentication webhook request:

{
  "username": "alice",
  "remoteAddress": "203.0.113.42:54321",
  "connectionId": "0123456789ABCDEF",
  "passwordAuthRequest": {
    "password": "hunter2"
  }
}

For public key authentication, the passwordAuthRequest field is replaced by publicKeyAuthRequest:

{
  "username": "alice",
  "remoteAddress": "203.0.113.42:54321",
  "connectionId": "0123456789ABCDEF",
  "publicKeyAuthRequest": {
    "publicKey": "AAAAB3NzaC1yc2EAAAA..."
  }
}

The authentication webhook must respond with:

{ "success": true }

or

{ "success": false }
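
Because the password and public key request shapes shown above differ only in which nested field is present, a webhook can normalise them into a tagged union before applying policy. The sketch below assumes exactly the field names shown in this section; `parseAuthRequest`, `ParsedAuth`, and `isRecord` are hypothetical names introduced for illustration.

```typescript
type ParsedAuth =
  | { kind: "password"; username: string; connectionId: string; password: string }
  | { kind: "publicKey"; username: string; connectionId: string; publicKey: string }
  | { kind: "invalid"; reason: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseAuthRequest(raw: unknown): ParsedAuth {
  if (!isRecord(raw)) {
    return { kind: "invalid", reason: "body is not a JSON object" };
  }
  const { username, connectionId } = raw;
  if (typeof username !== "string" || username.length === 0) {
    return { kind: "invalid", reason: "username missing" };
  }
  if (typeof connectionId !== "string") {
    return { kind: "invalid", reason: "connectionId missing" };
  }
  const pw = raw.passwordAuthRequest;
  const pk = raw.publicKeyAuthRequest;
  // Exactly one credential block is expected; ambiguous bodies are rejected.
  if (isRecord(pw) && isRecord(pk)) {
    return { kind: "invalid", reason: "both credential blocks present" };
  }
  if (isRecord(pw) && typeof pw.password === "string") {
    return { kind: "password", username, connectionId, password: pw.password };
  }
  if (isRecord(pk) && typeof pk.publicKey === "string") {
    return { kind: "publicKey", username, connectionId, publicKey: pk.publicKey };
  }
  return { kind: "invalid", reason: "no usable credential block" };
}
```

A handler would respond with `{ "success": false }` for the `invalid` case, so malformed bodies are denied rather than surfaced as errors.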

Configuration webhook request:

{
  "username": "alice",
  "remoteAddress": "203.0.113.42:54321",
  "connectionId": "0123456789ABCDEF"
}

The configuration webhook responds with a ContainerSSH backend configuration block. For a Kubernetes backend:

{
  "config": {
    "backend": "kubernetes",
    "kubernetes": {
      "pod": {
        "metadata": {
          "namespace": "ssh-sessions"
        },
        "spec": {
          "containers": [{
            "name": "shell",
            "image": "ghcr.io/your-org/dev-shell:latest",
            "imagePullPolicy": "Always"
          }]
        }
      }
    }
  }
}

An empty config response ({}) tells ContainerSSH to use its default configuration. A non-2xx response from either webhook causes ContainerSSH to reject the connection.


Cloudflare Workers Implementation

Worker Architecture

The Worker handles both the authentication and configuration webhook paths. A shared secret sent by ContainerSSH in the X-ContainerSSH-Token header is validated before any processing occurs. After validating the secret, the auth path handles two cases: for password authentication it treats the SSH password as an OIDC token and validates it; for public key authentication it checks the presented key against an allow list in KV. The config path looks up the user’s group from KV and returns the appropriate container configuration.

wrangler.toml

name = "containerssh-auth"
main = "src/index.ts"
compatibility_date = "2026-01-01"
compatibility_flags = ["nodejs_compat"]

[[kv_namespaces]]
binding = "USER_GROUPS"
id = "your-kv-namespace-id-here"
preview_id = "your-kv-preview-namespace-id-here"

[[kv_namespaces]]
binding = "JWKS_CACHE"
id = "your-jwks-cache-namespace-id-here"
preview_id = "your-jwks-cache-preview-namespace-id-here"

[vars]
OIDC_ISSUER = "https://accounts.example.com"
OIDC_AUDIENCE = "containerssh"

# Secrets set via: wrangler secret put CONTAINERSSH_SHARED_SECRET
# wrangler secret put JWKS_URL

KV namespaces are bound directly to the Worker environment — no credentials, no HTTP calls to a KV API. The shared secret and JWKS URL are set as Cloudflare secrets, which are encrypted at rest and injected at runtime. They do not appear in wrangler.toml or the bundle.

Full Worker Implementation

// src/index.ts
export interface Env {
  USER_GROUPS: KVNamespace;
  JWKS_CACHE: KVNamespace;
  CONTAINERSSH_SHARED_SECRET: string;
  OIDC_ISSUER: string;
  OIDC_AUDIENCE: string;
  JWKS_URL: string;
}

interface ContainerSSHAuthRequest {
  username: string;
  remoteAddress: string;
  connectionId: string;
  passwordAuthRequest?: { password: string };
  publicKeyAuthRequest?: { publicKey: string };
}

interface ContainerSSHConfigRequest {
  username: string;
  remoteAddress: string;
  connectionId: string;
}

interface JWTHeader {
  alg: string;
  kid: string;
}

interface JWTClaims {
  sub: string;
  iss: string;
  aud: string | string[];
  exp: number;
  email?: string;
  groups?: string[];
}

interface JWKSKey {
  kty: string;
  use: string;
  kid: string;
  n: string;
  e: string;
  alg: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Validate shared secret on every request
    const authHeader = request.headers.get("X-ContainerSSH-Token");
    if (!authHeader || authHeader !== env.CONTAINERSSH_SHARED_SECRET) {
      return new Response(JSON.stringify({ error: "unauthorized" }), {
        status: 401,
        headers: { "Content-Type": "application/json" },
      });
    }

    const url = new URL(request.url);

    if (request.method !== "POST") {
      return new Response(JSON.stringify({ error: "method not allowed" }), {
        status: 405,
        headers: { "Content-Type": "application/json" },
      });
    }

    try {
      if (url.pathname === "/auth") {
        return await handleAuth(request, env);
      } else if (url.pathname === "/config") {
        return await handleConfig(request, env);
      } else {
        return new Response(JSON.stringify({ error: "not found" }), {
          status: 404,
          headers: { "Content-Type": "application/json" },
        });
      }
    } catch (err) {
      // Fail closed: any unexpected error rejects the connection
      console.error("Handler error:", err);
      return new Response(JSON.stringify({ success: false }), {
        status: 200,
        headers: { "Content-Type": "application/json" },
      });
    }
  },
};

async function handleAuth(request: Request, env: Env): Promise<Response> {
  const body: ContainerSSHAuthRequest = await request.json();

  // Password auth: extract OIDC token from the password field
  // Users pass their OIDC token as the SSH password for token-based auth
  if (body.passwordAuthRequest) {
    const token = body.passwordAuthRequest.password;
    const valid = await validateOIDCToken(token, env);
    return new Response(JSON.stringify({ success: valid }), {
      status: 200,
      headers: { "Content-Type": "application/json" },
    });
  }

  // Public key auth: check a hash of the presented key against an allow
  // list in KV. Extracting an OIDC token from OpenSSH certificate
  // extensions is not implemented here.
  if (body.publicKeyAuthRequest) {
    const allowed = await checkPublicKeyAllowList(
      body.username,
      body.publicKeyAuthRequest.publicKey,
      env
    );
    return new Response(JSON.stringify({ success: allowed }), {
      status: 200,
      headers: { "Content-Type": "application/json" },
    });
  }

  // Neither auth method provided: reject
  return new Response(JSON.stringify({ success: false }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
}

async function handleConfig(request: Request, env: Env): Promise<Response> {
  const body: ContainerSSHConfigRequest = await request.json();

  // Look up user group from KV (TTL-bounded cache)
  const group = await env.USER_GROUPS.get(body.username);
  const containerConfig = buildContainerConfig(group ?? "readonly");

  return new Response(JSON.stringify({ config: containerConfig }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
}

async function validateOIDCToken(token: string, env: Env): Promise<boolean> {
  try {
    const parts = token.split(".");
    if (parts.length !== 3) return false;

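    // JWT segments are base64url-encoded; atob() alone rejects "-" and "_"
    const decoder = new TextDecoder();
    const header: JWTHeader = JSON.parse(decoder.decode(base64UrlDecode(parts[0])));
    const claims: JWTClaims = JSON.parse(decoder.decode(base64UrlDecode(parts[1])));

    // Pin the algorithm rather than trusting whatever the token declares
    if (header.alg !== "RS256") return false;

    // Check expiry before hitting JWKS: fast fail on expired tokens.
    // A missing or non-numeric exp is rejected rather than treated as valid.
    const nowSeconds = Math.floor(Date.now() / 1000);
    if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return false;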

    // Validate issuer
    if (claims.iss !== env.OIDC_ISSUER) return false;

    // Validate audience
    const aud = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
    if (!aud.includes(env.OIDC_AUDIENCE)) return false;

    // Fetch JWKS (from KV cache or IdP endpoint)
    const jwks = await getJWKS(env);
    const key = jwks.find((k) => k.kid === header.kid);
    if (!key) return false;

    // Import the RSA public key and verify the RS256 signature
    const cryptoKey = await importRSAPublicKey(key);
    const signatureInput = new TextEncoder().encode(`${parts[0]}.${parts[1]}`);
    const signature = base64UrlDecode(parts[2]);

    return await crypto.subtle.verify(
      { name: "RSASSA-PKCS1-v1_5", hash: "SHA-256" },
      cryptoKey,
      signature,
      signatureInput
    );
  } catch {
    return false;
  }
}

async function getJWKS(env: Env): Promise<JWKSKey[]> {
  // Try KV cache first (cached with 1-hour TTL)
  const cached = await env.JWKS_CACHE.get("jwks", "json");
  if (cached) return cached as JWKSKey[];

  // Fetch from IdP
  const response = await fetch(env.JWKS_URL, {
    headers: { Accept: "application/json" },
    cf: { cacheTtl: 3600 },
  });

  if (!response.ok) {
    // If fetch fails and we have no cache, fail closed
    throw new Error(`JWKS fetch failed: ${response.status}`);
  }

  const jwks = await response.json<{ keys: JWKSKey[] }>();

  // Store in KV with 1-hour TTL
  await env.JWKS_CACHE.put("jwks", JSON.stringify(jwks.keys), {
    expirationTtl: 3600,
  });

  return jwks.keys;
}

async function importRSAPublicKey(jwk: JWKSKey): Promise<CryptoKey> {
  return crypto.subtle.importKey(
    "jwk",
    { kty: jwk.kty, n: jwk.n, e: jwk.e, alg: jwk.alg, use: jwk.use },
    { name: "RSASSA-PKCS1-v1_5", hash: "SHA-256" },
    false,
    ["verify"]
  );
}

async function checkPublicKeyAllowList(
  username: string,
  publicKey: string,
  env: Env
): Promise<boolean> {
  // KV key: "pubkey:<username>:<sha256-fingerprint-of-key>"
  // Value: "allowed" | absent
  const fingerprint = await sha256Hex(publicKey);
  const kvKey = `pubkey:${username}:${fingerprint}`;
  const entry = await env.USER_GROUPS.get(kvKey);
  return entry === "allowed";
}

function buildContainerConfig(group: string): object {
  const baseSpec = {
    metadata: { namespace: "ssh-sessions" },
    spec: {
      automountServiceAccountToken: false,
      securityContext: {
        runAsNonRoot: true,
        seccompProfile: { type: "RuntimeDefault" },
      },
    },
  };

  const groupConfigs: Record<string, object> = {
    dev: {
      backend: "kubernetes",
      kubernetes: {
        pod: {
          ...baseSpec,
          spec: {
            ...baseSpec.spec,
            containers: [{
              name: "shell",
              image: "ghcr.io/your-org/dev-shell:v2.1.0",
              imagePullPolicy: "Always",
              resources: {
                limits: { cpu: "500m", memory: "512Mi" },
                requests: { cpu: "100m", memory: "128Mi" },
              },
              securityContext: {
                allowPrivilegeEscalation: false,
                readOnlyRootFilesystem: false,
                capabilities: { drop: ["ALL"] },
              },
            }],
          },
        },
      },
    },
    admin: {
      backend: "kubernetes",
      kubernetes: {
        pod: {
          ...baseSpec,
          metadata: { ...baseSpec.metadata, namespace: "ssh-admin" },
          spec: {
            ...baseSpec.spec,
            containers: [{
              name: "shell",
              image: "ghcr.io/your-org/admin-shell:v2.1.0",
              imagePullPolicy: "Always",
              resources: {
                limits: { cpu: "1000m", memory: "1Gi" },
                requests: { cpu: "200m", memory: "256Mi" },
              },
              securityContext: {
                allowPrivilegeEscalation: false,
                readOnlyRootFilesystem: false,
                capabilities: { drop: ["ALL"] },
              },
            }],
          },
        },
      },
    },
    readonly: {
      backend: "kubernetes",
      kubernetes: {
        pod: {
          ...baseSpec,
          spec: {
            ...baseSpec.spec,
            containers: [{
              name: "shell",
              image: "ghcr.io/your-org/readonly-shell:v2.1.0",
              imagePullPolicy: "Always",
              resources: {
                limits: { cpu: "200m", memory: "256Mi" },
                requests: { cpu: "50m", memory: "64Mi" },
              },
              securityContext: {
                allowPrivilegeEscalation: false,
                readOnlyRootFilesystem: true,
                capabilities: { drop: ["ALL"] },
              },
            }],
          },
        },
      },
    },
  };

  return groupConfigs[group] ?? groupConfigs["readonly"];
}

function base64UrlDecode(input: string): Uint8Array {
  const base64 = input.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64.padEnd(base64.length + ((4 - (base64.length % 4)) % 4), "=");
  const binary = atob(padded);
  return Uint8Array.from(binary, (c) => c.charCodeAt(0));
}

async function sha256Hex(input: string): Promise<string> {
  const data = new TextEncoder().encode(input);
  const hash = await crypto.subtle.digest("SHA-256", data);
  return Array.from(new Uint8Array(hash))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}
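
The `!==` check on `X-ContainerSSH-Token` in the Worker above may return as soon as the first differing character is found, which in principle leaks timing information about the secret. A minimal sketch of a comparison whose running time does not depend on where the inputs differ follows. The `constantTimeEqual` name is hypothetical; the Workers runtime also exposes `crypto.subtle.timingSafeEqual`, which is likely the better choice where it is available.

```typescript
// Compares two strings without short-circuiting on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  const encoder = new TextEncoder();
  const left = encoder.encode(a);
  const right = encoder.encode(b);

  // Fold the length difference into the result instead of returning early.
  let diff = left.length ^ right.length;
  const length = Math.max(left.length, right.length);

  for (let i = 0; i < length; i++) {
    // Reads past the end of the shorter input yield undefined; treat
    // them as 0 so the loop always covers the longer input.
    diff |= (left[i] ?? 0) ^ (right[i] ?? 0);
  }
  return diff === 0;
}
```

A missing or empty header should still be rejected before the comparison runs, as the Worker already does with its `!authHeader` check.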

ContainerSSH Configuration

Configure ContainerSSH to call the edge Worker endpoints:

# containerssh.yaml
auth:
  webhook:
    url: "https://containerssh-auth.your-account.workers.dev/auth"
    timeout: 10s
    headers:
      X-ContainerSSH-Token: "${CONTAINERSSH_SHARED_SECRET}"
    tlsOptions:
      # Client certificate for mTLS authentication to the Worker
      clientKeyFile: "/etc/containerssh/client.key"
      clientCertFile: "/etc/containerssh/client.crt"
      # Pin the Worker's CA (Cloudflare's certificate authority)
      cacertFile: "/etc/containerssh/cloudflare-ca.crt"

configserver:
  webhook:
    url: "https://containerssh-auth.your-account.workers.dev/config"
    timeout: 10s
    headers:
      X-ContainerSSH-Token: "${CONTAINERSSH_SHARED_SECRET}"
    tlsOptions:
      clientKeyFile: "/etc/containerssh/client.key"
      clientCertFile: "/etc/containerssh/client.crt"
      cacertFile: "/etc/containerssh/cloudflare-ca.crt"

Cache Invalidation via Cloudflare API

When group membership changes in the IdP, purge the corresponding KV entry to eliminate stale authorization windows. This is called from your IdP’s provisioning webhook or a CI/CD pipeline:

// invalidate-user-group.ts — called from IdP provisioning webhook
async function invalidateUserGroup(
  username: string,
  accountId: string,
  namespaceId: string,
  apiToken: string
): Promise<void> {
  // URL-encode the key: usernames may contain characters such as "/" or "@"
  const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/storage/kv/namespaces/${namespaceId}/values/${encodeURIComponent(username)}`;

  const response = await fetch(url, {
    method: "DELETE",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
  });

  if (!response.ok) {
    throw new Error(`KV invalidation failed: ${response.status}`);
  }
}

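After invalidation, the next configuration request for that user performs a fresh KV lookup, although KV deletions can take some time to propagate to every PoP. If the entry is absent, the Worker as written falls back to the readonly group. To restore the correct group, either have your provisioning system re-populate the KV entry or extend the Worker to derive the group from an authoritative source such as the OIDC token’s groups claim.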
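
The fallback order described above (KV entry first, then token claims, then readonly) can be sketched as a pure function. The `resolveGroup` helper and the `KNOWN_GROUPS` list are illustrative assumptions; the Worker shown earlier only implements the KV lookup with a readonly default.

```typescript
// Groups for which a container configuration exists (assumed to mirror
// the keys used in buildContainerConfig).
const KNOWN_GROUPS = ["admin", "dev", "readonly"] as const;
type Group = (typeof KNOWN_GROUPS)[number];

function isKnownGroup(value: unknown): value is Group {
  return (
    typeof value === "string" &&
    (KNOWN_GROUPS as readonly string[]).includes(value)
  );
}

function resolveGroup(
  kvGroup: string | null,
  claimGroups: readonly string[] | undefined
): Group {
  // 1. A KV entry written by the provisioning system wins when present.
  if (isKnownGroup(kvGroup)) return kvGroup;

  // 2. Otherwise take the first recognised group from the token claims.
  const fromClaims = claimGroups?.find(isKnownGroup);
  if (fromClaims !== undefined) return fromClaims;

  // 3. Unknown or absent everywhere: least-privileged default.
  return "readonly";
}
```

Taking the first recognised claim is one possible policy; a deployment with overlapping groups may prefer an explicit priority order instead.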


Fastly Compute@Edge Alternative (Rust/WASM)

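For operators already on Fastly or who prefer a compiled WASM binary over JavaScript, the same routing and configuration logic can be written in Rust. The version below is a skeleton: it handles password (token) authentication only, and its token validation is a stub that must be replaced with full signature verification before deployment.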

# fastly.toml
manifest_version = 3
name = "containerssh-auth"
language = "rust"
description = "ContainerSSH auth webhook at the edge"

[local_server.kv_stores.user_groups]
format = "json"
file = "local-kv/user-groups.json"

[setup.kv_stores.user_groups]
name = "user-groups"

[setup.kv_stores.jwks_cache]
name = "jwks-cache"

// src/main.rs
use fastly::http::{Method, StatusCode};
use fastly::{Error, KVStore, Request, Response};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct AuthRequest {
    username: String,
    #[serde(rename = "connectionId")]
    connection_id: String,
    #[serde(rename = "passwordAuthRequest")]
    password_auth: Option<PasswordAuth>,
}

#[derive(Deserialize)]
struct PasswordAuth {
    password: String,
}

#[derive(Serialize)]
struct AuthResponse {
    success: bool,
}

#[derive(Serialize)]
struct ConfigResponse {
    config: serde_json::Value,
}

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
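    // Validate shared secret. Fail closed if the secret is not configured:
    // an empty expected value would otherwise match a request with no header.
    // NOTE: assumes the secret is exposed as an environment variable; on
    // Fastly a Secret Store or Config Store is the usual place for it.
    let shared_secret = std::env::var("CONTAINERSSH_SHARED_SECRET")
        .unwrap_or_default();
    let token_header = req
        .get_header_str("X-ContainerSSH-Token")
        .unwrap_or_default();

    if shared_secret.is_empty() || token_header != shared_secret {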
        return Ok(Response::from_status(StatusCode::UNAUTHORIZED)
            .with_body_text_plain("unauthorized\n"));
    }

    if req.get_method() != Method::POST {
        return Ok(Response::from_status(StatusCode::METHOD_NOT_ALLOWED)
            .with_body_text_plain("method not allowed\n"));
    }

    match req.get_path() {
        "/auth" => handle_auth(req),
        "/config" => handle_config(req),
        _ => Ok(Response::from_status(StatusCode::NOT_FOUND)
            .with_body_text_plain("not found\n")),
    }
}

fn handle_auth(mut req: Request) -> Result<Response, Error> {
    let body: AuthRequest = req.take_body_json()?;

    let success = if let Some(pw_auth) = body.password_auth {
        // Validate OIDC token from password field
        validate_oidc_token(&pw_auth.password)
    } else {
        false
    };

    let resp = AuthResponse { success };
    Ok(Response::from_status(StatusCode::OK)
        .with_body_json(&resp)?)
}

fn handle_config(mut req: Request) -> Result<Response, Error> {
    let body: AuthRequest = req.take_body_json()?;

    // Look up group from KV store
    let group = get_user_group(&body.username).unwrap_or_else(|| "readonly".to_string());
    let config = build_container_config(&group);

    let resp = ConfigResponse { config };
    Ok(Response::from_status(StatusCode::OK)
        .with_body_json(&resp)?)
}

fn get_user_group(username: &str) -> Option<String> {
    let store = KVStore::open("user-groups").ok()??;
    let entry = store.lookup(username).ok()??;
    let text = entry.into_string();
    Some(text.trim().to_string())
}

fn validate_oidc_token(token: &str) -> bool {
    // Production implementation: use a JWT crate (jsonwebtoken)
    // with the JWKS fetched from KVStore or a Fastly backend.
    // This stub validates structural form only; replace with full
    // signature verification before deployment.
    let parts: Vec<&str> = token.split('.').collect();
    if parts.len() != 3 {
        return false;
    }
    // Full implementation: decode header, fetch JWKS from KV,
    // verify RS256 signature with the matching kid key.
    true
}

fn build_container_config(group: &str) -> serde_json::Value {
    match group {
        "admin" => serde_json::json!({
            "backend": "kubernetes",
            "kubernetes": {
                "pod": {
                    "metadata": { "namespace": "ssh-admin" },
                    "spec": {
                        "containers": [{ "name": "shell", "image": "ghcr.io/your-org/admin-shell:v2.1.0" }]
                    }
                }
            }
        }),
        "dev" => serde_json::json!({
            "backend": "kubernetes",
            "kubernetes": {
                "pod": {
                    "metadata": { "namespace": "ssh-sessions" },
                    "spec": {
                        "containers": [{ "name": "shell", "image": "ghcr.io/your-org/dev-shell:v2.1.0" }]
                    }
                }
            }
        }),
        _ => serde_json::json!({
            "backend": "kubernetes",
            "kubernetes": {
                "pod": {
                    "metadata": { "namespace": "ssh-sessions" },
                    "spec": {
                        "containers": [{ "name": "shell", "image": "ghcr.io/your-org/readonly-shell:v2.1.0" }]
                    }
                }
            }
        }),
    }
}

Deploy with:

fastly compute build
fastly compute deploy

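Fastly AOT-compiles the .wasm binary at deploy time, so there is no compilation step on the request path. Each request gets a fresh Wasmtime instance, so no in-memory state carries over between requests. This is stronger isolation than a Worker’s reused warm isolate against attacks that depend on state persisting from one request to the next.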


Expected Behaviour

| Scenario | Response Time | Auth Decision | Notes |
|---|---|---|---|
| Valid SSH connect, valid OIDC token, user in KV | < 8 ms | Allow + group config | KV hit, JWKS from KV cache, RS256 verify in-isolate |
| Valid SSH connect, invalid OIDC token (bad signature) | < 5 ms | Deny | Fails at RS256 verify; no user-group KV read needed |
| Valid SSH connect, expired OIDC token | < 3 ms | Deny | Fails at exp check before JWKS fetch |
| Revoked group membership, KV entry not yet purged | < 8 ms | Allow with old config | Stale KV window; mitigated by short TTL + explicit purge |
| Direct HTTP call without X-ContainerSSH-Token | < 2 ms | 401 Unauthorized | Shared secret check is first operation; no business logic runs |
| Worker cold start (first request to a new PoP) | 20–80 ms | Allow or deny (correct) | Isolate init + JWKS fetch; subsequent requests use warm isolate |
| JWKS endpoint unreachable, KV cache present | < 8 ms | Allow or deny (correct) | KV cache serves JWKS; function proceeds normally |
| JWKS endpoint unreachable, KV cache expired | < 10 ms | Deny | Fail closed; no token can be validated without JWKS |

Cold start latency (20–80 ms) occurs on the first request to a new PoP or after a Worker deployment. ContainerSSH’s 10-second webhook timeout means even a cold start does not cause a connection failure. Subsequent requests from the same PoP use a warm isolate and complete in under 10 ms.


Trade-offs

| Dimension | Edge Worker Approach | Self-Hosted Webhook |
|---|---|---|
| Geographic latency | < 10 ms globally (PoP colocation) | 1–50+ ms depending on distance and routing |
| Availability | Platform SLA (Cloudflare: 99.99%); no single PoP failure affects all users | Single server failure blocks all SSH unless you run a HA pair with load balancing |
| Scaling | Automatic; no operator action during load spikes | Manual scaling or autoscaling configuration required |
| Group membership freshness | Eventually consistent via KV (configurable TTL 60–3600 s) | Can query LDAP/IdP synchronously on every request (strongly consistent) |
| LDAP/Active Directory integration | Not directly possible from Worker; must pre-populate KV from a sync job | Direct LDAP client call from webhook server; no sync required |
| Blast radius of code compromise | WASM sandbox; no metadata endpoint, no internal network, no persistent disk | Full host access if webhook server is compromised |
| KV consistency model (Cloudflare) | Cloudflare KV: eventual consistency; writes replicate in seconds | Not applicable (direct database query) |
| Durable Objects (strong consistency) | Durable Objects provide strong consistency but add ~5–15 ms for cross-PoP reads | Not applicable |
| Cost at low volume | Cloudflare Workers Free: 100K requests/day; Paid: $0.30/million requests | Self-hosted: fixed infrastructure cost regardless of request volume |
| Cost at high volume | Workers pricing scales linearly; predictable | Fixed cost; more efficient per-request at high volume |
| Portability | Tied to Cloudflare (or Fastly) platform APIs | Portable; runs anywhere with a runtime |
| Observability | Cloudflare Logpush to SIEM; Workers Analytics; limited per-request detail | Full logging flexibility; structured logs to any target |
| Deployment complexity | wrangler deploy; no server provisioning | Server provisioning, TLS termination, process management, monitoring |

Failure Modes

| Failure | Impact | Detection | Mitigation |
|---|---|---|---|
| Cloudflare PoP outage | ContainerSSH cannot reach auth webhook; SSH connections fail at webhook timeout | ContainerSSH error logs; SSH connection refused | ContainerSSH does not support a fallback webhook URL natively; place a health-checking proxy in front of the edge endpoint and a self-hosted fallback server |
| KV cache stale after group revocation | User retains access to old container config until TTL expires | Audit logs showing sessions from revoked users | Set KV TTL to 60–120 s; trigger explicit KV key deletion from IdP provisioning webhook on user role change |
| JWKS endpoint unreachable from Worker | No OIDC tokens can be validated; all token-based auth fails | Workers error logs; spike in auth failures | Cache JWKS in KV with 1-hour TTL; fail closed if KV entry expired and JWKS unreachable (do not fail open) |
| Worker bundle size exceeds 10 MB (Paid plan limit) | Deployment fails | Wrangler deploy error | Split policy logic: move JWKS validation to a separate Worker; use Service Bindings to call it from the main auth Worker |
| Shared secret rotation without atomic rollout | New ContainerSSH config has new secret; old Workers deployment has old secret | 401 errors on auth webhook calls | Use a grace period: validate both old and new secrets for a 5-minute window during rotation; remove old secret after confirming new secret is active |
| mTLS certificate expiry on ContainerSSH client | TLS handshake fails; auth webhook unreachable | ContainerSSH TLS errors; SSH connection refused | Automate certificate renewal with cert-manager; set certificate validity to 90 days with renewal at 60 days; alert at 14 days remaining |
| Worker cold start during SSH connection spike | First requests to new PoPs see 20–80 ms latency instead of sub-10 ms | Workers Analytics latency percentile spike | Acceptable: well within ContainerSSH’s 10 s timeout, so no action is required |
| Fastly Compute AOT cache miss after deployment | Slightly higher latency on first request after deploy at each PoP | Fastly logs | AOT compilation happens at deploy time; Fastly pre-distributes compiled binary to all PoPs |
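
The grace-period mitigation for shared secret rotation can be sketched as follows. The `SecretConfig` shape, the `isAcceptedSecret` name, and the idea of storing the outgoing secret with an expiry timestamp are assumptions for illustration. The comparator is passed in so that a constant-time comparison can be supplied by the caller.

```typescript
interface SecretConfig {
  current: string;
  previous?: string;             // secret being rotated out, if any
  previousValidUntilMs?: number; // epoch ms after which `previous` is rejected
}

type SecretComparator = (a: string, b: string) => boolean;

function isAcceptedSecret(
  presented: string | null,
  config: SecretConfig,
  nowMs: number,
  equal: SecretComparator
): boolean {
  // Fail closed on a missing header or an unconfigured secret.
  if (!presented || config.current.length === 0) return false;

  if (equal(presented, config.current)) return true;

  // The previous secret is honoured only inside the grace window.
  const { previous, previousValidUntilMs } = config;
  if (previous === undefined || previous.length === 0) return false;
  if (previousValidUntilMs === undefined || nowMs >= previousValidUntilMs) {
    return false;
  }
  return equal(presented, previous);
}

// Usage: a 5-minute grace window starting at rotation time.
const rotatedAtMs = Date.UTC(2026, 0, 1, 12, 0, 0);
const exampleConfig: SecretConfig = {
  current: "new-secret",
  previous: "old-secret",
  previousValidUntilMs: rotatedAtMs + 5 * 60 * 1000,
};
```

Once the window has passed, the previous secret stops working without a redeploy; removing it from configuration afterwards is still worthwhile so it is not retained longer than needed.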