ContainerSSH as a Bastion Host Replacement: Ephemeral Containers per SSH Session
Problem
A traditional bastion host starts life as a clean, hardened jump server. Six months later it has accumulated forty system user accounts (some belonging to people who left the company), a dozen authorized_keys files of uncertain provenance, shared credentials for downstream systems stored in home directories, and no isolation whatsoever between concurrent sessions. It is the most sensitive host on the network and the least frequently audited.
The structural problems are inherent to the model, not the configuration:
Persistent user accounts. Adding a new operator means creating a system user on the bastion. Offboarding requires deleting it — but offboarding processes fail, people change roles, contractors finish engagements. Accounts accumulate. Each one is a potential entry point.
No session isolation. Two administrators logged in simultaneously share the same kernel, the same process namespace, and often the same filesystem. A compromised session can read /proc/<pid>/environ of adjacent processes, scrape credentials from memory with ptrace, or simply write to shared directories.
Shared state becomes a liability. Files placed on the bastion persist. Operators copy credentials, keys, or sensitive data to the bastion for convenience. Over time it becomes a data store nobody intended to create.
Lateral movement risk. Compromising the bastion gives an attacker an authenticated foothold on the internal network with established trust relationships to every host the bastion reaches. The bastion’s network position — wide egress, trusted by internal hosts — makes post-exploitation trivial.
No automatic cleanup. When a session ends, nothing changes on the host. Logs persist in the user’s home directory. Bash history accumulates. Files remain until manually purged.
ContainerSSH replaces this model with ephemeral isolation. Every SSH connection triggers a webhook authentication call, which returns a container specification. ContainerSSH launches a fresh container, attaches the SSH session to it, and destroys the container when the connection closes. The bastion host itself runs no persistent user sessions. There are no system users to accumulate. The only persistent state is the ContainerSSH binary, its config, and the SSH host key.
Target systems: Linux hosts running Docker or Podman, or with access to a Kubernetes cluster; ContainerSSH can use any of the three as its container backend. Building from source requires Go 1.21 or later; prebuilt binaries and container images are available. The auth webhook can be written in any language.
Threat Model
The threat model for a bastion host is one of concentration risk. The bastion aggregates access to the entire internal network, so its compromise has outsized consequences.
Adversary 1 — Attacker compromises an active session. Via a vulnerability in a tool installed on the bastion, an attacker escapes from a user’s shell into the bastion’s OS.
- Traditional bastion: Attacker is now on the bastion OS with access to all other active sessions, all home directories, and the bastion’s network interfaces. They can read other sessions via /proc, scrape credentials from memory, exfiltrate files from shared directories, and begin lateral movement to internal hosts immediately.
- ContainerSSH: Attacker is inside a container with the network, filesystem, and process namespace of that single session. Other sessions run in separate containers. The bastion OS is not directly accessible. Escape requires a container breakout (a separate, harder exploit class). The blast radius is scoped to one container’s lifetime.
Adversary 2 — Credential theft from the bastion host itself. An attacker with read access to the bastion’s filesystem.
- Traditional bastion: All user home directories are readable. SSH private keys, API tokens, and .bash_history files containing plaintext commands with embedded secrets are all present and all persistent.
- ContainerSSH: The bastion filesystem contains the ContainerSSH binary, a config file, and the SSH host key. No user home directories. No persistent credentials. Auth decisions are made by the external webhook, not by files on disk.
Adversary 3 — Insider threat / data exfiltration after session ends. A malicious operator copies sensitive data to the bastion and retrieves it later.
- Traditional bastion: Data persists indefinitely in the user’s home directory unless purged by policy. No automatic enforcement.
- ContainerSSH: The container is destroyed on disconnect. Any data inside it is gone. There is no persistent storage unless the container spec explicitly mounts a volume — which the auth webhook controls and can deny.
Adversary 4 — Stale account exploitation. A former employee’s account on the bastion.
- Traditional bastion: The account exists until manually removed. SSH keys may still be valid. Organisational offboarding processes are imperfect.
- ContainerSSH: There are no system accounts on the bastion host. Authentication is entirely delegated to the webhook. Disabling access means updating the auth backend (LDAP, IAM, a database row). No bastion-specific cleanup required.
| Scenario | Traditional Bastion Blast Radius | ContainerSSH Blast Radius |
|---|---|---|
| Active session compromise | Full host access, all sessions, internal network | Single session container, container lifetime only |
| Filesystem read access | All user home dirs, all credentials, history files | ContainerSSH binary, config, host key — no user data |
| Post-session data retrieval | Data persists indefinitely | Container destroyed on disconnect |
| Stale account | Valid until manually removed, SSH keys live on host | No system accounts; disable in auth backend only |
| Auth backend compromised | N/A (no central auth) | Full SSH access granted; requires auth backend hardening |
Configuration
Architecture Overview
ContainerSSH sits in front of a container backend. The flow for every SSH connection is:
- Client connects to ContainerSSH’s listening port (default 2222).
- ContainerSSH calls the config webhook with the username; the webhook returns per-user backend configuration (optional; the configuration can be static).
- ContainerSSH calls the auth webhook with the username and credential (password or public key). The webhook returns true or false, and optionally overrides the container configuration for that user.
- On successful auth, ContainerSSH instructs the backend (Docker, Kubernetes, Podman) to launch a container.
- The SSH session is attached to the container process. The user gets a shell inside the container.
- On disconnect, ContainerSSH signals the backend to remove the container.
Client ──SSH──► ContainerSSH ──webhook──► Auth Service (LDAP/IAM/DB)
                     │
                     └──backend API──► Docker / Kubernetes / Podman
                                            │
                                            └──► Ephemeral Container (per session)
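The lifecycle above can be sketched in a few lines. This is an illustrative model only, not ContainerSSH's implementation: `FakeBackend` and `ssh_session` are hypothetical names, and the backend is an in-memory stand-in for Docker, Kubernetes, or Podman. The point it shows is the ordering (no container before successful auth) and the guaranteed removal when the session ends, whether cleanly or through an error.

```python
import uuid
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


class FakeBackend:
    """In-memory stand-in for a container backend (hypothetical)."""

    def __init__(self) -> None:
        self.containers: Dict[str, str] = {}  # container id -> image

    def launch(self, image: str) -> str:
        container_id = uuid.uuid4().hex[:12]
        self.containers[container_id] = image
        return container_id

    def remove(self, container_id: str) -> None:
        self.containers.pop(container_id, None)


@contextmanager
def ssh_session(
    username: str,
    credential: str,
    authenticate: Callable[[str, str], bool],
    backend: FakeBackend,
    image: str = "registry.internal/bastion-shell:latest",
) -> Iterator[str]:
    """Authenticate, launch one container, and always remove it on exit."""
    if not authenticate(username, credential):
        # Auth failure: nothing is launched on the backend.
        raise PermissionError(f"authentication failed for {username}")
    container_id = backend.launch(image)
    try:
        yield container_id  # the SSH channel would be attached here
    finally:
        # Runs on a clean exit and when the session ends with an error.
        backend.remove(container_id)
```

Used as `with ssh_session("alice", key, check, backend) as container_id: ...`, the backend holds exactly one container for the duration of the block and none afterwards.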
Installing ContainerSSH
# Download the release binary and its checksum into a scratch directory
# (check https://containerssh.io/releases for the current version).
CSSH_VERSION="0.5.1"
cd "$(mktemp -d)"
curl -fLO "https://github.com/ContainerSSH/ContainerSSH/releases/download/v${CSSH_VERSION}/containerssh-linux-amd64"
curl -fLO "https://github.com/ContainerSSH/ContainerSSH/releases/download/v${CSSH_VERSION}/containerssh-linux-amd64.sha256"
# Verify before installing. This assumes the .sha256 file uses the usual
# "<hash>  containerssh-linux-amd64" layout; otherwise compare against the SHA256 on the release page.
sha256sum -c containerssh-linux-amd64.sha256
# Install only after the checksum matches.
install -m 0755 containerssh-linux-amd64 /usr/local/bin/containerssh
# Generate the SSH host key for ContainerSSH
# This is the key clients will verify — keep it stable and back it up.
mkdir -p /etc/containerssh
ssh-keygen -t ed25519 -f /etc/containerssh/host_key -C "containerssh-bastion" -N ""
chmod 600 /etc/containerssh/host_key
Minimal config.yaml (Docker Backend)
# /etc/containerssh/config.yaml
# ContainerSSH minimal configuration: Docker backend
log:
  level: info
  format: json
ssh:
  listen: "0.0.0.0:2222"
  # Path to the SSH host private key generated above.
  # Clients will see this key's fingerprint; rotate with care.
  hostkeys:
    - /etc/containerssh/host_key
auth:
  # ContainerSSH calls this URL for every authentication attempt.
  # The webhook receives the username and credential; returns success/failure.
  webhook:
    url: "http://auth-service.internal:8080/auth"
    timeout: 5s
configserver:
  # Optional: per-user container config overrides.
  # Omit this section to use static backend config for all users.
  url: "http://auth-service.internal:8080/config"
  timeout: 5s
backend: docker
docker:
  connection:
    # Docker socket: use a Unix socket for local Docker, TCP for remote.
    host: "unix:///var/run/docker.sock"
  execution:
    # Container image for SSH sessions.
    # Build and maintain this image separately; see "Session Container Image" below.
    launch:
      containerConfig:
        image: "registry.internal/bastion-shell:latest"
        # Run as a non-root user inside the container.
        user: "10000:10000"
        # No privilege escalation inside the container.
        securityOpt:
          - "no-new-privileges:true"
        # Read-only root filesystem: no persistent writes.
        readonlyRootfs: true
        # Tmpfs for /tmp so tools that need temp space still work.
        tmpfs:
          /tmp: "rw,noexec,nosuid,size=64m"
        # Automatically remove the container when the SSH session ends.
        # This is ContainerSSH's default behaviour; shown explicitly for clarity.
        removeOnExit: true
        # Limit resources to prevent a single session from affecting others.
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
Auth Webhook
ContainerSSH sends a JSON POST to the auth URL for each authentication attempt. The payload structure differs slightly for password vs. public key auth:
// Password auth payload sent by ContainerSSH to the auth webhook
{
"username": "alice",
"remoteAddress": "203.0.113.45:54321",
"connectionId": "b3f2a...",
"passwordBase64": "c2VjcmV0"
}
// Public key auth payload
{
"username": "alice",
"remoteAddress": "203.0.113.45:54321",
"connectionId": "b3f2a...",
"publicKeyBase64": "AAAAB3NzaC1yc2EAAAA..."
}
The webhook returns a JSON object indicating success or failure:
// Success response
{"success": true}
// Failure response
{"success": false}
A minimal Python webhook that validates against a static list (replace with LDAP, IAM, or a database lookup in production):
# auth_webhook.py: minimal ContainerSSH auth webhook
# Run with: uvicorn auth_webhook:app --host 0.0.0.0 --port 8080
import base64
import hashlib
import json
import sys
from typing import Optional

from fastapi import FastAPI, Request
from pydantic import BaseModel

app = FastAPI()

# In production: look these up from LDAP, a database, or an IAM service.
# Keys are usernames; values are lists of authorised public key fingerprints
# in the unpadded SHA256 form printed by `ssh-keygen -l` (no trailing "=").
AUTHORISED_KEYS: dict[str, list[str]] = {
    "alice": ["SHA256:AbCdEfGhIjKlMnOpQrStUvWxYz0123456789abcdef"],
    "bob": ["SHA256:ZyXwVuTsRqPoNmLkJiHgFeDcBa9876543210fedcba"],
}


class AuthRequest(BaseModel):
    username: str
    remoteAddress: str
    connectionId: str
    passwordBase64: Optional[str] = None
    publicKeyBase64: Optional[str] = None


def ssh_pubkey_fingerprint(pubkey_b64: str) -> str:
    """Derive SHA256 fingerprint from base64-encoded SSH public key blob."""
    raw = base64.b64decode(pubkey_b64)
    digest = hashlib.sha256(raw).digest()
    # Strip base64 padding so the result matches the stored fingerprints.
    fp = base64.b64encode(digest).decode().rstrip("=")
    return f"SHA256:{fp}"


@app.post("/auth")
async def authenticate(req: AuthRequest):
    allowed_fps = AUTHORISED_KEYS.get(req.username, [])
    if req.publicKeyBase64:
        fp = ssh_pubkey_fingerprint(req.publicKeyBase64)
        success = fp in allowed_fps
    else:
        # Password auth is not recommended; shown for completeness.
        # In practice: reject password auth entirely or validate against MFA.
        success = False
    # Log every attempt, structured for SIEM ingestion.
    print(json.dumps({
        "event": "auth_attempt",
        "username": req.username,
        "remote_address": req.remoteAddress,
        "connection_id": req.connectionId,
        "auth_type": "pubkey" if req.publicKeyBase64 else "password",
        "success": success,
    }), file=sys.stderr)
    return {"success": success}


@app.post("/config")
async def container_config(req: Request):
    # Optional: return per-user container config overrides.
    # Return an empty object to use the static config from config.yaml.
    return {}
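Before pointing ContainerSSH at the webhook, it can help to exercise it by hand. The sketch below builds the public key payload described above from one authorized_keys line, posts it, and treats anything other than an explicit `{"success": true}` as a denial. The function names are illustrative, and the payload fields are the ones shown in this article rather than a definitive description of every ContainerSSH release.

```python
import base64
import json
import urllib.request


def pubkey_payload(username: str, authorized_keys_line: str,
                   remote: str = "203.0.113.45:54321",
                   connection_id: str = "smoke-test") -> dict:
    """Build a public key auth payload from one authorized_keys line."""
    # Expected layout: "<key type> <base64 blob> [comment]". This sketch does
    # not handle leading options such as from="...".
    blob_b64 = authorized_keys_line.split()[1]
    base64.b64decode(blob_b64, validate=True)  # raises if not valid base64
    return {
        "username": username,
        "remoteAddress": remote,
        "connectionId": connection_id,
        "publicKeyBase64": blob_b64,
    }


def is_allowed(response_body: bytes) -> bool:
    """Only an explicit {"success": true} counts as an approval."""
    try:
        return json.loads(response_body).get("success") is True
    except (ValueError, AttributeError):
        # Malformed or non-object responses are treated as a denial.
        return False


def ask_webhook(url: str, payload: dict, timeout: float = 5.0) -> bool:
    """POST the payload to the auth webhook and return its decision."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_allowed(response.read())
```

Against the webhook above, `ask_webhook("http://auth-service.internal:8080/auth", pubkey_payload("alice", line))` should return True only when the key's fingerprint is listed for alice.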
Session Container Image
The container image defines the tools available inside each SSH session. Apply the same principles as a distroless or hardened base image: include only what operators need, nothing more.
# Dockerfile.bastion-shell
# Minimal session container for ContainerSSH bastion access.
# Build: docker build -t registry.internal/bastion-shell:latest -f Dockerfile.bastion-shell .
FROM debian:12-slim AS base
# Install only the tools operators need for bastion access.
# Adjust to your environment — add kubectl, aws-cli, etc. as required.
RUN apt-get update && apt-get install -y --no-install-recommends \
bash \
openssh-client \
curl \
ca-certificates \
less \
vim-tiny \
jq \
netcat-openbsd \
iputils-ping \
dnsutils \
&& rm -rf /var/lib/apt/lists/*
# Create a non-root user for the SSH session.
# All sessions run as this user regardless of the username used to connect.
RUN groupadd -g 10000 operator && \
useradd -u 10000 -g operator -m -s /bin/bash -d /home/operator operator
# No package manager cache, no setuid binaries beyond what the base requires.
RUN find / -perm /4000 -type f 2>/dev/null | \
grep -v -E '^/(bin/su|usr/bin/passwd|usr/bin/newgrp)$' | \
xargs chmod u-s 2>/dev/null || true
USER 10000:10000
WORKDIR /home/operator
# ContainerSSH will exec the shell directly — no SSH daemon needed in the container.
CMD ["/bin/bash"]
Build and push this image to your internal registry. Tag by date or digest, not just latest, so container launches are reproducible and rollback is straightforward.
SSH Certificate Authority Integration
If your organisation uses an SSH CA (see SSH Certificate Authority), the auth webhook validates certificates rather than raw public keys. The certificate’s principal becomes the identity passed to your authorisation logic.
ContainerSSH passes the presented public key (or certificate public key) to the auth webhook. To accept certificates signed by your CA, the webhook extracts the certificate, verifies the CA signature, checks the principal, and validates the validity window:
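# Certificate validation addition to auth_webhook.py
import base64
import os
import struct
import subprocess
import tempfile


def _ssh_keygen(*args: str) -> str:
    """Run ssh-keygen; return its stdout, or "" if it fails."""
    result = subprocess.run(["ssh-keygen", *args], capture_output=True, text=True, timeout=5)
    return result.stdout if result.returncode == 0 else ""


def _sha256_token(output: str, label: str = "") -> str:
    """Return the SHA256:... token from the first line that starts with label."""
    for line in output.splitlines():
        if line.strip().startswith(label):
            return next((t for t in line.split() if t.startswith("SHA256:")), "")
    return ""


def _extract_principals(output: str) -> list[str]:
    """Collect the names listed under "Principals:" in ssh-keygen -L output."""
    principals, in_block = [], False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Principals:"):
            in_block = True
        elif in_block and stripped.startswith("Critical Options:"):
            break
        elif in_block and stripped:
            principals.append(stripped)
    return principals


def validate_ssh_certificate(pubkey_b64: str, username: str, ca_pubkey_path: str) -> bool:
    """
    Validate an SSH certificate presented via ContainerSSH's auth webhook.
    Returns True only if the cert names the trusted CA as its signer and the
    connecting username is among its principals.
    """
    raw = base64.b64decode(pubkey_b64)
    # The blob starts with a length-prefixed key type; ssh-keygen expects
    # "<key type> <base64 blob>" on a single line.
    (type_len,) = struct.unpack_from(">I", raw, 0)
    key_type = raw[4:4 + type_len].decode("ascii")
    with tempfile.NamedTemporaryFile("w", suffix="-cert.pub", delete=False) as f:
        f.write(f"{key_type} {pubkey_b64}\n")
        cert_path = f.name
    try:
        output = _ssh_keygen("-L", "-f", cert_path)
        signer_fp = _sha256_token(output, "Signing CA:")
        ca_fp = _sha256_token(_ssh_keygen("-l", "-E", "sha256", "-f", ca_pubkey_path))
        # Both conditions must hold: trusted signer AND matching principal.
        # The validity window is not checked here. A production implementation
        # should parse the cert binary directly using a library like
        # golang.org/x/crypto/ssh for stronger validation.
        return bool(ca_fp) and signer_fp == ca_fp and username in _extract_principals(output)
    finally:
        os.unlink(cert_path)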
In practice, use a Go-based webhook for certificate validation — Go’s golang.org/x/crypto/ssh package parses SSH certificates natively and verifies CA signatures without shelling out.
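For readers who want to see what "checks the principal and validates the validity window" involves at the byte level, here is a minimal Python sketch that reads those fields from an Ed25519 certificate blob. It follows the field order documented in OpenSSH's PROTOCOL.certkeys, handles only the Ed25519 certificate type, and deliberately does NOT verify the CA signature, so it is not a substitute for a real SSH library. The function names are hypothetical.

```python
import base64
import struct
import time
from typing import List, Optional, Tuple

ED25519_CERT = b"ssh-ed25519-cert-v01@openssh.com"
USER_CERT = 1  # SSH2_CERT_TYPE_USER in PROTOCOL.certkeys


def _read_string(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read one length-prefixed SSH string; return (value, next offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated SSH string")
    return buf[start:end], end


def parse_ed25519_cert(pubkey_b64: str) -> dict:
    """Extract serial, type, principals and validity from a cert blob."""
    blob = base64.b64decode(pubkey_b64)
    key_type, off = _read_string(blob, 0)
    if key_type != ED25519_CERT:
        raise ValueError("not an Ed25519 certificate")
    _nonce, off = _read_string(blob, off)
    _public_key, off = _read_string(blob, off)
    serial, cert_type = struct.unpack_from(">QI", blob, off)
    off += 12
    _key_id, off = _read_string(blob, off)
    packed_principals, off = _read_string(blob, off)
    valid_after, valid_before = struct.unpack_from(">QQ", blob, off)

    # The principals field is itself a sequence of length-prefixed strings.
    principals: List[str] = []
    pos = 0
    while pos < len(packed_principals):
        name, pos = _read_string(packed_principals, pos)
        principals.append(name.decode())
    return {
        "serial": serial,
        "type": cert_type,
        "principals": principals,
        "valid_after": valid_after,
        "valid_before": valid_before,
    }


def cert_permits(cert: dict, username: str, now: Optional[int] = None) -> bool:
    """User cert, username among principals, and now inside the window."""
    now = int(time.time()) if now is None else now
    return (
        cert["type"] == USER_CERT
        and username in cert["principals"]  # a cert with no principals is rejected
        and cert["valid_after"] <= now < cert["valid_before"]
    )
```

A webhook would call `cert_permits(parse_ed25519_cert(req.publicKeyBase64), req.username)` only after the signature has been verified by other means.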
Migrating from a Traditional Bastion
Migration is a DNS cutover combined with moving SSH key distribution to the auth webhook:
- Inventory current bastion users. Export all authorized_keys entries. Map them to identities in your directory (LDAP, Active Directory, GitHub usernames).
- Build the auth webhook. Import the public key fingerprints into your auth backend. Validate that every current user can authenticate against the webhook before cutting over.
- Deploy ContainerSSH in parallel. Run it on a different port or hostname. Have operators test access without changing the production bastion.
- Update DNS. Change the bastion.example.com A record to point to the ContainerSSH host. Old bastion remains reachable at legacy-bastion.example.com during the transition window.
- Communicate the host key change. Clients will see a new SSH host key (ContainerSSH’s key, not the old bastion’s). Distribute the new fingerprint or use an SSH CA host certificate so clients verify it automatically.
- Decommission the old bastion. After a stabilisation period (one to two weeks), remove the legacy bastion. Delete all its system user accounts and rotate any credentials that were stored on it.
Host Key Management
ContainerSSH’s host key is what SSH clients verify to confirm they are connecting to the legitimate bastion. It must be stable, securely stored, and backed up.
# Generate a dedicated host key for ContainerSSH.
# Use Ed25519 — compact, fast, strong.
ssh-keygen -t ed25519 -f /etc/containerssh/host_key -C "bastion.example.com" -N ""
# Store the private key in a secrets manager (Vault, AWS Secrets Manager, etc.)
# and retrieve it at service startup rather than leaving it on disk.
# Example: retrieve from HashiCorp Vault at startup
vault kv get -field=private_key secret/containerssh/host_key > /etc/containerssh/host_key
chmod 600 /etc/containerssh/host_key
# Distribute the public key fingerprint to operators' known_hosts,
# or sign it with your SSH CA so clients verify it automatically.
ssh-keygen -l -f /etc/containerssh/host_key.pub
# Output: 256 SHA256:xxxx... bastion.example.com (ED25519)
Add the fingerprint to operator workstations’ ~/.ssh/known_hosts, or better, use an SSH CA host certificate:
# Sign ContainerSSH's host key with your SSH CA
# (requires the host CA private key — see /articles/linux/ssh-certificate-authority/)
ssh-keygen -s /etc/ssh/ca/host_ca \
-I "bastion.example.com" \
-h \
-n "bastion.example.com,bastion,10.0.1.50" \
-V "+52w" \
/etc/containerssh/host_key.pub
# Creates /etc/containerssh/host_key-cert.pub
# Reference the cert in ContainerSSH config.yaml:
# ssh:
# hostkeys:
# - /etc/containerssh/host_key
# hostcerts:
# - /etc/containerssh/host_key-cert.pub
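With a host certificate, clients trust the CA (one @cert-authority line in known_hosts) rather than individual host fingerprints. No known_hosts update is needed when the host key is rotated or the bastion’s IP changes, as long as the new certificate lists the name clients connect to.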
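The client-side trust entry is a single known_hosts line beginning with the `@cert-authority` marker, followed by host patterns and the CA public key. A minimal sketch for generating that line from the host CA's public key follows; `cert_authority_line` is a hypothetical helper, and the patterns should match whatever names operators actually use to reach the bastion.

```python
def cert_authority_line(host_patterns: list[str], ca_public_key: str) -> str:
    """Build a known_hosts entry that trusts a host CA for the given patterns."""
    fields = ca_public_key.split()
    if len(fields) < 2:
        raise ValueError("expected '<key type> <base64 blob> [comment]'")
    key_type, blob = fields[0], fields[1]
    # Patterns are comma-separated; the comment from the .pub file is dropped.
    return f"@cert-authority {','.join(host_patterns)} {key_type} {blob}"
```

Calling it with `["bastion.example.com", "10.0.1.50"]` and the contents of the host CA's .pub file yields a line that can be appended to operators' ~/.ssh/known_hosts or distributed through configuration management.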
Running ContainerSSH as a Systemd Service
# /etc/systemd/system/containerssh.service
[Unit]
Description=ContainerSSH — ephemeral container SSH gateway
After=network.target docker.service
Requires=docker.service
[Service]
Type=simple
User=containerssh
Group=containerssh
ExecStart=/usr/local/bin/containerssh --config /etc/containerssh/config.yaml
Restart=on-failure
RestartSec=5s
# Harden the service process itself.
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/etc/containerssh
CapabilityBoundingSet=
AmbientCapabilities=
[Install]
WantedBy=multi-user.target
# Create a dedicated service account (no login shell, no home directory).
useradd -r -s /usr/sbin/nologin -M -d /nonexistent containerssh
chown -R containerssh:containerssh /etc/containerssh
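# Add to the docker group to allow Docker socket access, OR use rootless Docker.
# Note: membership of the docker group is effectively root on the host,
# so rootless Docker is the safer choice where it is available.
usermod -aG docker containerssh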
systemctl daemon-reload
systemctl enable --now containerssh
systemctl status containerssh
Expected Behaviour
| Scenario | ContainerSSH Behaviour | Security Outcome |
|---|---|---|
| User connects with valid SSH key | Auth webhook called → returns success → container launched from specified image → user dropped into shell | Fresh isolated container, no shared state with prior sessions |
| User disconnects (clean exit) | ContainerSSH signals backend to remove container immediately | Container and all in-container data destroyed; no residual state |
| User disconnects (ungraceful: network drop) | ContainerSSH detects TCP close or SSH keepalive timeout → signals backend to remove container | Container removed even without clean session termination |
| Attacker compromises the session container | Attacker is inside one container’s namespace; container has no persistent storage, read-only root FS, non-root user | Blast radius limited to that container’s lifetime; no access to other sessions or bastion OS without additional exploit |
| Auth webhook returns failure | SSH connection rejected at the authentication stage; no container launched | Zero-trust enforcement: no backend access without explicit auth approval |
| Auth webhook is unavailable | All SSH connections fail (ContainerSSH cannot authenticate without a webhook response) | Fail-closed behaviour; no unauthenticated access; requires webhook HA for production |
| Container backend (Docker) unreachable | Auth webhook may succeed; container launch fails; SSH connection dropped with error | User cannot connect; alert on backend errors; ensures no half-open sessions |
| Container image pull fails | Container launch fails; SSH connection dropped | User cannot connect; pre-pull images on the host to avoid runtime pull delays |
| Session timeout (idle) | Configure ssh.clientAliveInterval and ssh.clientAliveCountMax; ContainerSSH closes connection and removes container | Idle sessions do not persist indefinitely; containers reclaimed automatically |
Trade-offs
| Trade-off | Implication | Mitigation |
|---|---|---|
| Stateless sessions — no persistent work directory | Operators cannot leave files between sessions; any work-in-progress is lost on disconnect | Mount a network volume (NFS, S3FS) into the container via the auth webhook config override; scope it per-user |
| Container startup latency | Each SSH connection waits for a container to start (typically 0.5–3 s with a pre-pulled image, longer on cold pull) | Pre-pull the session image on the ContainerSSH host; use a slim image; accept latency as a security trade-off |
| Webhook as a single point of failure | If the auth webhook is down, all SSH access is blocked — harder to troubleshoot under incident conditions | Run multiple webhook instances behind a load balancer; implement a local fallback (break-glass account on the bastion OS itself, distinct from ContainerSSH) |
| Container image maintenance burden | The session image must be patched for OS CVEs like any other container image | Add the session image to your container scanning pipeline; automate rebuilds on base image updates |
| Container escape risk | A container breakout exploit would give the attacker access to the bastion OS | Use a hardened container runtime (gVisor, Kata Containers) for higher-assurance deployments; keep the bastion OS minimal |
| No persistent audit trail inside container | Session activity inside the container is not automatically captured | Use a session recorder (ContainerSSH has built-in audit log support); ship logs to a centralised SIEM before container removal |
| Auth webhook must understand SSH pubkey formats | Validating SSH public keys and certificates requires SSH-specific parsing logic | Use a Go-based webhook with golang.org/x/crypto/ssh; avoid reinventing certificate validation |
Failure Modes
| Failure Mode | Symptom | Immediate Impact | Resolution |
|---|---|---|---|
| Auth webhook down | SSH connections hang then time out (governed by the webhook timeout, 5 s in the configuration above) | All SSH access blocked; no sessions can be established | Deploy webhook as HA service (2+ replicas); monitor webhook health endpoint; maintain break-glass access via separate mechanism |
| Container backend unreachable (Docker socket gone, Kubernetes API unavailable) | Auth succeeds but container launch fails; SSH client receives connection error | No new sessions; existing sessions unaffected (containers already running) | Monitor Docker/Kubernetes API health; alert on ContainerSSH launch errors; restart Docker daemon or restore API connectivity |
| SSH host key lost or rotated unexpectedly | Clients see WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED and refuse to connect | All client connections rejected until known_hosts updated | Back up host key in a secrets manager; use SSH CA host certificates so clients trust the CA, not individual keys; document rotation procedure |
| Session container OOM (Out of Memory) | Container is killed by the OOM killer; SSH session drops | User’s session is terminated abruptly; no data loss beyond in-session work | Set appropriate memory limits in config; monitor container metrics; alert on OOM events; resize limits if legitimate |
| Container not cleaned up on ungraceful disconnect | Container continues running after TCP session drops; resources consumed until ContainerSSH detects timeout | Wasted compute; potential data exposure if container has mounted volumes | Configure SSH keepalive aggressively (clientAliveInterval 30, clientAliveCountMax 3); monitor for orphaned containers; ContainerSSH’s cleanup goroutine handles most cases |
| Auth webhook returns wrong result (false positive or false negative) | Unauthorised user gains SSH access, or a legitimate user is denied | Security control failure: potential unauthorised access or loss of legitimate access | Add integration tests to auth webhook; log all decisions; alert on unexpected access patterns; review webhook code as security-critical |
| Config webhook returns malformed container spec | Container launch fails; SSH connection dropped | User cannot connect; may affect all users if using shared config endpoint | Validate webhook responses in CI; test config webhook separately from auth webhook; fall back to static config if config webhook is optional |
| ContainerSSH process crashes | SSH port becomes unreachable | All SSH access blocked | Run under systemd with Restart=on-failure; monitor port availability; alert on service restarts |
Related Articles
- SSH Certificate Authority: Short-Lived User Certificates and Host Verification — set up the SSH CA whose certificates ContainerSSH’s auth webhook can validate, and whose host certificate eliminates the bastion host-key trust-on-first-use problem
- SSH Bastion Host Hardening — if ContainerSSH is not yet an option, these controls reduce the attack surface of a traditional bastion
- Zero Trust Architecture Principles — the architectural principles behind why ephemeral, just-in-time access models reduce lateral movement risk
- ContainerSSH Webhook Auth Hardening — hardening the auth webhook itself: mTLS between ContainerSSH and the webhook, rate limiting, and tamper-evident audit logging
- ContainerSSH Kubernetes Backend — using Kubernetes as the container backend instead of Docker, with Pod security policies, namespace isolation per session, and integration with Kubernetes RBAC