SPIFFE and SPIRE: Cryptographic Workload Identity for Zero Trust Kubernetes

Problem

Service-to-service authentication in Kubernetes defaults to one of three failure modes: shared static secrets rotated infrequently (or never), IP-based access controls that break the moment pod IPs churn, or ambient network trust where anything inside the cluster can talk to anything else. None of these reflects the security boundary you actually want: a specific workload, attested by the platform, holds a cryptographic credential proving its identity, and that credential expires in hours, not years.

SPIFFE (Secure Production Identity Framework for Everyone) defines the standard. SPIRE (SPIFFE Runtime Environment) implements it. Together they give every workload a short-lived X.509 certificate or JWT carrying a URI that identifies the workload exactly: not its IP, not a bare service account name, but a structured identity bound to its attestable properties at the time of issuance.

The gaps SPIFFE/SPIRE close:

Long-lived static secrets shared across workloads: one extraction compromises the whole environment.
IP-based allowlists with no workload attestation: an attacker on any pod with the right source IP passes as a legitimate caller.
No cross-cluster identity: federated workloads have no common trust mechanism without bespoke integration.
Certificate issuance tied to human operators: every rotation is a change request, so TTLs stretch to months to keep the operational load down.

Target systems: SPIRE v1.9+, Kubernetes 1.29+, Envoy 1.29+, Istio 1.21+.

Threat Model

Adversary 1 — Stolen service credentials: An attacker extracts a long-lived API key or certificate from a compromised pod. With a 1-year TTL, they have persistent access to any service that trusts that credential.
Adversary 2 — Impersonation via network position: A compromised pod claims to be a payment service by spoofing its source IP or presenting a legitimate cluster service account token to an internal API that doesn’t verify caller identity.
Adversary 3 — Cross-cluster lateral movement: An attacker who compromises a workload in a staging cluster attempts to use its identity to call production services that blindly trust any credential issued by the organization’s internal CA.
Adversary 4 — Node compromise for mass impersonation: An attacker gaining node-level access attempts to fabricate SVID requests for any workload running on that node.
Access level: Adversaries 1–3 have pod-level execution. Adversary 4 has node-level execution (container escape or DaemonSet compromise).
Objective: Obtain valid credentials to impersonate high-value services, exfiltrate data, or move laterally across clusters.
Blast radius with SPIRE: X.509 SVIDs default to 1-hour TTL. A stolen SVID expires within the hour. Workload attestation binds credentials to provable platform properties — fabricating an SVID requires compromising the SPIRE Agent and forging its node attestation, which leaves audit artifacts.

SPIFFE Identity Model

SPIFFE URI Format

Every SPIFFE identity is a URI of the form:

spiffe://<trust-domain>/<workload-path>

Examples in a multi-cluster environment:

spiffe://prod.example.com/ns/payments/sa/payment-api
spiffe://prod.example.com/ns/auth/sa/token-service
spiffe://staging.example.com/ns/payments/sa/payment-api
spiffe://partner.external.com/service/inbound-webhook

The trust domain (prod.example.com) is the root of trust: every SVID issued under it chains to the same trust bundle, the set of CA certificates for that domain. Federation allows workloads in prod.example.com to verify SVIDs from partner.external.com once the two domains have exchanged trust bundles.

The path component is arbitrary but should encode the minimal identity properties needed for authorization policy. Kubernetes deployments conventionally use ns/<namespace>/sa/<service-account>, but you can encode environment, region, or role into the path for fine-grained policy.
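
To make the URI structure concrete, here is a minimal sketch of a parser that splits a SPIFFE ID into trust domain and path and, when the path follows the ns/<namespace>/sa/<service-account> convention, pulls out those two fields. The function name parse_spiffe_id is hypothetical, and the checks cover only a small part of what the SPIFFE ID specification requires.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into its trust domain and path.

    Illustrative only: validates the scheme and the absence of a query or
    fragment, not the full character rules of the SPIFFE ID specification.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if not parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"malformed SPIFFE ID: {spiffe_id!r}")

    result = {"trust_domain": parts.netloc, "path": parts.path}

    # Kubernetes convention from the text: ns/<namespace>/sa/<service-account>
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) == 4 and segments[0] == "ns" and segments[2] == "sa":
        result["namespace"] = segments[1]
        result["service_account"] = segments[3]
    return result


if __name__ == "__main__":
    print(parse_spiffe_id("spiffe://prod.example.com/ns/payments/sa/payment-api"))
```

Authorization code that consumes these fields should compare the trust domain as well as the path; the staging and production payment-api IDs above differ only in trust domain.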

SVID Types

X.509 SVID is a standard X.509 certificate with the SPIFFE URI embedded in the Subject Alternative Name (SAN) field as a URI SAN. The certificate is signed by the trust domain’s CA. The workload uses it for mTLS — both sides present their X.509 SVID, verify the other’s certificate chain against the trust bundle, and extract the SPIFFE URI from the peer’s SAN for authorization decisions.

Subject: CN=payment-api
X509v3 Subject Alternative Name:
    URI:spiffe://prod.example.com/ns/payments/sa/payment-api
Validity:
    Not Before: May  9 08:00:00 2026 GMT
    Not After : May  9 09:00:00 2026 GMT  (1-hour TTL)

JWT-SVID is a signed JWT where the sub claim carries the SPIFFE URI and the aud claim specifies the target service. Use JWT-SVIDs when the transport doesn’t support mutual TLS — REST APIs where the caller adds the token as a Bearer header, or messaging systems where you attach the JWT as metadata.

{
  "sub": "spiffe://prod.example.com/ns/payments/sa/payment-api",
  "aud": ["spiffe://prod.example.com/ns/auth/sa/token-service"],
  "exp": 1746781500,
  "iat": 1746781200
}

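JWT-SVIDs have shorter recommended TTLs (5 minutes) because they are bearer tokens: they aren’t bound to a TLS session, so anyone who captures one can replay it to the named audience until it expires, and there is no revocation mechanism. Use X.509 SVIDs for service mesh mTLS and JWT-SVIDs only when HTTP-level bearer tokens are required.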
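
The claim checks a receiving service applies can be sketched as follows. This assumes the token's signature has already been verified against the trust bundle by a JWT library; the function name check_jwt_svid_claims is hypothetical and only the sub, aud, and exp claims described above are examined.

```python
import time


def check_jwt_svid_claims(claims, expected_audience, now=None):
    """Check sub/aud/exp on an already signature-verified JWT-SVID payload.

    Returns the caller's SPIFFE ID on success and raises ValueError otherwise.
    Signature verification is deliberately out of scope for this sketch.
    """
    now = time.time() if now is None else now

    sub = claims.get("sub", "")
    if not sub.startswith("spiffe://"):
        raise ValueError("sub claim is not a SPIFFE ID")

    aud = claims.get("aud", [])
    if isinstance(aud, str):  # a JWT aud claim may be a string or a list
        aud = [aud]
    if expected_audience not in aud:
        raise ValueError("token was not issued for this service")

    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    return sub
```

The audience check is what stops a token minted for one service from being replayed against another, which matters more for bearer tokens than for session-bound certificates.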

SPIRE Architecture

┌─────────────────────────────────────────────────┐
│  SPIRE Server (StatefulSet, HA)                 │
│  ┌──────────────┐  ┌──────────────────────────┐ │
│  │  CA Plugin   │  │  Datastore (SQLite/      │ │
│  │  (disk/Vault)│  │  PostgreSQL/MySQL)       │ │
│  └──────────────┘  └──────────────────────────┘ │
│  ┌────────────────────────────────────────────┐  │
│  │  Registration API  │  Node API             │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────┬──────────────────────────┘
                       │ node attestation
                       │ SVID issuance
┌──────────────────────▼──────────────────────────┐
│  SPIRE Agent (DaemonSet, one per node)          │
│  ┌──────────────┐  ┌──────────────────────────┐ │
│  │  Node        │  │  Workload Attestor       │ │
│  │  Attestor    │  │  (k8s, unix, docker)     │ │
│  │  (k8s_psat)  │  └──────────────────────────┘ │
│  └──────────────┘                               │
│  ┌────────────────────────────────────────────┐  │
│  │  Workload API (Unix domain socket)         │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────┬──────────────────────────┘
                       │ /run/spire/sockets/agent.sock
         ┌─────────────┴──────────────┐
         ▼                            ▼
   Payment API Pod              Auth Service Pod
   (fetches X.509 SVID)         (fetches X.509 SVID)

SPIRE Server is the trust anchor. It holds the signing CA (or delegates to Vault), stores registration entries (the policy mapping workload selectors to SPIFFE IDs), and issues SVIDs to authenticated agents. Run it as a StatefulSet with persistent storage for HA.

SPIRE Agent runs as a DaemonSet on every node. On startup it attests itself to the server, proving that it is running on a legitimate Kubernetes node by presenting a projected Kubernetes service account token (the k8s_psat attestor). Once attested, the agent caches SVIDs locally and serves them to workloads via the Workload API socket.

Workload API is a gRPC API exposed over a Unix domain socket mounted into workload pods. Workloads call FetchX509SVID or FetchJWTSVID. The agent performs workload attestation on each call: it reads the caller’s PID from the socket, inspects /proc/<pid>/cgroup to find the pod UID and container ID, queries the local kubelet to verify pod identity, and matches the result against registered selectors.

Deploying SPIRE with Helm

Add the SPIRE Helm chart repository and install server and agent in a dedicated namespace.

helm repo add spiffe https://spiffe.github.io/helm-charts-hardened/
helm repo update

Create a values file for the server:

# spire-server-values.yaml
global:
  openshift: false
  spire:
    trustDomain: "prod.example.com"
    clusterName: "prod-east"

spire-server:
  replicaCount: 3
  ha:
    enabled: true
    postgresConfig:
      host: "postgres.spire.svc.cluster.local"
      port: 5432
      dbName: "spire"
      secretName: "spire-postgres-creds"

  caKeyType: "ec-p384"
  caTTL: "24h"
  defaultX509SvidTTL: "1h"
  defaultJwtSvidTTL: "5m"

  nodeAttestor:
    k8sPsat:
      enabled: true
      serviceAccountAllowList:
        - "spire:spire-agent"

  keyManager:
    disk:
      enabled: false
    awsKms:
      enabled: false

  upstreamAuthority:
    vault:
      enabled: true
      vaultAddr: "https://vault.internal:8200"
      namespace: "pki"
      pkiPath: "pki/spire"
      insecureSkipVerify: false

  tornjak:
    enabled: true
    replicaCount: 1
    image:
      registry: ghcr.io
      repository: spiffe/tornjak-backend
      tag: "v1.4.2"

# spire-agent-values.yaml
spire-agent:
  server:
    address: "spire-server.spire.svc.cluster.local"
    port: 8081

  workloadAttestors:
    k8s:
      enabled: true
      skipKubeletVerification: false
      nodeNameEnv: "MY_NODE_NAME"

  socketPath: "/run/spire/sockets/agent.sock"

  hostSocketDir: "/run/spire/sockets"

  sds:
    enabled: true
    defaultSvidName: "default"
    defaultBundleName: "ROOTCA"
    defaultAllBundlesName: "ALL"

Deploy:

helm install spire spiffe/spire \
  --namespace spire \
  --create-namespace \
  --values spire-server-values.yaml \
  --values spire-agent-values.yaml \
  --wait

Kubernetes Attestation

Node Attestation with k8s_psat

When a SPIRE Agent starts on a node, it must prove to the SPIRE Server that it is a legitimate node in the cluster. The k8s_psat (Kubernetes Projected Service Account Token) attestor does this:

The kubelet mounts a short-lived projected service account token into the agent pod, with its audience set to the SPIRE Server.
The agent reads the token and sends it to the SPIRE Server as its attestation payload.
The SPIRE Server calls the Kubernetes TokenReview API to validate the token and its audience.
The server checks the token’s service account against its allow list and looks up the node the agent pod is running on.
If every check passes, the server issues a node SVID for that agent with SPIFFE ID spiffe://prod.example.com/spire/agent/k8s_psat/prod-east/<node-uid>.

The server-side configuration for node attestation:

NodeAttestor "k8s_psat" {
  plugin_data {
    clusters = {
      "prod-east" = {
        service_account_allow_list = ["spire:spire-agent"]
        kube_config_file           = ""
        allowed_pod_label_keys     = []
        allowed_node_label_keys    = []
      }
    }
  }
}
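
As a rough illustration of the server-side decision, the sketch below applies an allow list to the status portion of a Kubernetes TokenReview response. The function name agent_allowed is hypothetical, and the "spire-server" audience is an assumption about the deployment; service account usernames in TokenReview responses have the form system:serviceaccount:<namespace>:<name>.

```python
SA_PREFIX = "system:serviceaccount:"


def agent_allowed(status, allow_list, audience="spire-server"):
    """Decide whether a TokenReview status identifies an allowed agent.

    status is the `status` object of a TokenReview response, as a dict.
    allow_list holds "<namespace>:<service-account>" strings, matching the
    service_account_allow_list format in the config above.
    """
    if not status.get("authenticated", False):
        return False
    if audience not in status.get("audiences", []):
        return False
    username = status.get("user", {}).get("username", "")
    if not username.startswith(SA_PREFIX):
        return False
    # "system:serviceaccount:spire:spire-agent" -> "spire:spire-agent"
    return username[len(SA_PREFIX):] in allow_list
```

A token from any other service account, or one minted for a different audience, fails the check even though the API server considers it authentic.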

Workload Attestation by Pod Selectors

Once a node is attested, the agent handles workload attestation. When a pod calls the Workload API socket, the agent:

Identifies the calling process via the Unix socket credentials (UID/PID).
Resolves the PID to a pod UID and container ID via /proc/<pid>/cgroup.
Queries the local kubelet for the pod’s namespace, service account, labels, and container details.
Matches the pod’s properties against registered selectors.
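
The PID-to-pod step can be sketched as a parse of the cgroup file contents. The path layouts handled here (cgroupfs and systemd styles) are assumptions about common container runtimes and are not exhaustive; pod_from_cgroup is a hypothetical helper, not SPIRE's implementation.

```python
import re

# Pod UID in either dashed (cgroupfs) or underscored (systemd) form.
POD_UID = re.compile(r"pod([0-9a-f]{8}(?:[-_][0-9a-f]{4}){3}[-_][0-9a-f]{12})")
# Container IDs are assumed to be 64 hex characters.
CONTAINER_ID = re.compile(r"([0-9a-f]{64})")


def pod_from_cgroup(cgroup_text):
    """Return (pod_uid, container_id) from /proc/<pid>/cgroup text, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "<hierarchy-id>:<controllers>:<path>".
        path = line.split(":", 2)[-1]
        uid = POD_UID.search(path)
        cid = CONTAINER_ID.search(path)
        if uid and cid:
            return uid.group(1).replace("_", "-"), cid.group(1)
    return None
```

A process that is not running in a pod yields None, which corresponds to the agent finding no Kubernetes selectors for the caller.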

Registration entries map selectors to SPIFFE IDs. Create entries with the SPIRE CLI:

# Node alias: one parent ID for every agent attested in the prod-east cluster
# (parent IDs are matched exactly; wildcards are not supported)
kubectl exec -n spire statefulset/spire-server -- \
  /opt/spire/bin/spire-server entry create \
  -node \
  -spiffeID spiffe://prod.example.com/spire/agent/cluster/prod-east \
  -selector k8s_psat:cluster:prod-east

# Payment API: matched by namespace + service account
kubectl exec -n spire statefulset/spire-server -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/payments/sa/payment-api \
  -parentID spiffe://prod.example.com/spire/agent/cluster/prod-east \
  -selector k8s:ns:payments \
  -selector k8s:sa:payment-api \
  -x509SVIDTTL 3600

# Privileged internal service: additional pod label selector
kubectl exec -n spire statefulset/spire-server -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/auth/sa/token-service \
  -parentID spiffe://prod.example.com/spire/agent/cluster/prod-east \
  -selector k8s:ns:auth \
  -selector k8s:sa:token-service \
  -selector k8s:pod-label:security-tier:critical \
  -x509SVIDTTL 3600

Multiple selectors on a single entry are evaluated as AND conditions — all must match for the workload to receive the SVID. An attacker running a rogue pod in the payments namespace with a different service account won’t match the payment-api entry.
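
The AND semantics amount to a subset test: an entry applies only when every one of its selectors appears among the selectors the agent observed for the workload. A minimal sketch, with a hypothetical matching_entries helper and entries modeled as plain dicts:

```python
def matching_entries(entries, observed_selectors):
    """Return the SPIFFE IDs of entries whose selectors are all observed."""
    observed = set(observed_selectors)
    return [
        entry["spiffe_id"]
        for entry in entries
        if set(entry["selectors"]) <= observed  # every selector must match
    ]


ENTRIES = [
    {
        "spiffe_id": "spiffe://prod.example.com/ns/payments/sa/payment-api",
        "selectors": ["k8s:ns:payments", "k8s:sa:payment-api"],
    },
    {
        "spiffe_id": "spiffe://prod.example.com/ns/auth/sa/token-service",
        "selectors": [
            "k8s:ns:auth",
            "k8s:sa:token-service",
            "k8s:pod-label:security-tier:critical",
        ],
    },
]
```

A pod may carry extra selectors beyond those an entry lists; only missing ones prevent a match.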

Automatic Certificate Rotation

The SPIRE Agent caches SVIDs locally and proactively rotates them. The rotation cycle:

Agent fetches an SVID with a 1-hour TTL from the server.
At 50% of TTL (30 minutes), the agent generates a new key pair and sends a CSR to the server.
The server signs the new SVID and returns it. The agent caches both old and new SVIDs briefly.
The Workload API starts serving the new SVID. The workload’s SPIFFE-aware library (go-spiffe, java-spiffe) receives the new SVID over its open FetchX509SVID stream and updates the TLS configuration in memory, with no process restart and no connection drain.
After a grace period, the old SVID is discarded.

The streaming Workload API is critical here. Libraries that implement the SPIFFE Workload API spec hold an open gRPC stream to the agent. The agent pushes SVID updates to the stream. The library hot-swaps the certificate in the TLS listener/dialer. The application code sees no interruption.
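
The rotation point in the cycle above is simple arithmetic on the certificate's validity window. A small sketch, assuming rotation at a fixed fraction of the lifetime (0.5 to match the 50% figure in the text); the function names are hypothetical.

```python
from datetime import datetime, timezone


def rotation_due_at(not_before, not_after, fraction=0.5):
    """Return the moment at which the given fraction of the lifetime has elapsed."""
    return not_before + (not_after - not_before) * fraction


def should_rotate(now, not_before, not_after, fraction=0.5):
    """True once the SVID has passed its rotation point."""
    return now >= rotation_due_at(not_before, not_after, fraction)


if __name__ == "__main__":
    issued = datetime(2026, 5, 9, 8, 0, tzinfo=timezone.utc)
    expires = datetime(2026, 5, 9, 9, 0, tzinfo=timezone.utc)
    print(rotation_due_at(issued, expires).isoformat())
```

Rotating at the halfway point leaves the remaining half of the lifetime as slack for server outages before any workload is left holding an expired credential.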

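Workloads reach the Workload API through the agent socket. Rather than exposing it with a hostPath mount, the SPIFFE CSI driver (csi.spiffe.io) mounts the directory containing the socket into the pod as an ephemeral volume:

volumes:
  - name: spiffe-workload-api
    csi:
      driver: "csi.spiffe.io"
      readOnly: true
containers:
  - name: payment-api
    volumeMounts:
      - name: spiffe-workload-api
        mountPath: /spiffe-workload-api
        readOnly: true

The CSI driver exposes only the Workload API socket; it does not write SVIDs, keys, or bundles to disk. For applications that cannot speak the Workload API and need certificate files, run a helper such as spiffe-helper alongside the application: it holds the stream to the agent, writes the X.509 SVID, private key, and trust bundle to a shared volume, and can signal the application to reload after each rotation.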

SPIFFE Federation Across Kubernetes Clusters

Federation allows workloads in different trust domains to verify each other’s SVIDs. The mechanism: each SPIRE Server exposes a bundle endpoint (an HTTPS endpoint serving its trust bundle, the set of root certificates for its trust domain). Each server fetches the remote bundle on a refresh interval and stores it locally. Workloads whose registration entries name the remote trust domain then receive its bundle alongside their own and can accept SVIDs from that domain as valid peer credentials.

Configure Bundle Endpoints

In the SPIRE Server config:

federation {
  bundle_endpoint {
    address = "0.0.0.0"
    port    = 8443
    acme {
      domain_name = "spire-federation.prod.example.com"
      email       = "platform@example.com"
      tos_accepted = true
    }
  }

  federates_with "staging.example.com" {
    bundle_endpoint_url     = "https://spire-federation.staging.example.com:8443"
    bundle_endpoint_profile "https_spiffe" {
      endpoint_spiffe_id = "spiffe://staging.example.com/spire/server"
    }
  }
}

Expose the bundle endpoint via a LoadBalancer service or ingress. The server at staging.example.com fetches this endpoint to learn your trust bundle.

Register Federated Entries

To allow a staging workload to call a production service, create a registration entry in the production SPIRE Server that specifies the federated trust domain:

# -parentID must be an exact SPIFFE ID (no wildcards); this one is a node alias
# entry created with: entry create -node -selector k8s_psat:cluster:prod-east
kubectl exec -n spire statefulset/spire-server -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/payments/sa/payment-api \
  -parentID spiffe://prod.example.com/spire/agent/cluster/prod-east \
  -selector k8s:ns:payments \
  -selector k8s:sa:payment-api \
  -federatesWith spiffe://staging.example.com \
  -x509SVIDTTL 3600

The Workload API response for this entry will carry the staging trust domain’s bundle alongside the production bundle; the SVID itself is unchanged. When the staging workload presents its SVID to the production service, the production service can verify it against the federated bundle.
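
On the verifying side, the important detail is that a peer's certificate is checked against the bundle of the trust domain named in its own SPIFFE ID, never against a union of all known roots. A minimal sketch of that lookup, with a hypothetical bundle_for_peer helper and bundles modeled as a dict keyed by trust domain:

```python
from urllib.parse import urlsplit


def bundle_for_peer(peer_spiffe_id, bundles):
    """Pick the trust bundle to verify a peer against, by its trust domain.

    bundles maps trust domain name -> list of root certificates (opaque here).
    Raises LookupError if the peer's trust domain is not federated.
    """
    parts = urlsplit(peer_spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {peer_spiffe_id!r}")
    if parts.netloc not in bundles:
        raise LookupError(f"no bundle for trust domain {parts.netloc!r}")
    return bundles[parts.netloc]
```

Keeping the bundles separate means a root CA from staging can never vouch for an identity that claims to be in prod.example.com.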

See zero trust architecture principles for the policy model that governs what federated identities are permitted to do once verified.

Integrating with Envoy via SDS

Envoy’s Secret Discovery Service (SDS) protocol allows SPIRE to push TLS certificates and trust bundles to Envoy dynamically, without file mounts or process restarts. The SPIRE Agent exposes an SDS-compatible endpoint on the same Unix socket as the Workload API.

Configure Envoy to use the SPIRE agent socket for SDS:

# envoy-bootstrap.yaml (relevant sections)
static_resources:
  clusters:
    - name: spire_agent
      connect_timeout: 0.25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: spire_agent
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    pipe:
                      path: /run/spire/sockets/agent.sock

# No dynamic_resources/ads_config entry is needed for SPIRE: the agent serves
# only SDS, and each sds_config in the listener points at the spire_agent
# cluster directly.

Reference SPIRE-managed secrets in the Envoy listener configuration:

filter_chains:
  - transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
        require_client_certificate: true
        common_tls_context:
          tls_certificate_sds_secret_configs:
            - name: "spiffe://prod.example.com/ns/payments/sa/payment-api"
              sds_config:
                api_config_source:
                  api_type: GRPC
                  transport_api_version: V3
                  grpc_services:
                    - envoy_grpc:
                        cluster_name: spire_agent
          combined_validation_context:
            default_validation_context:
              match_typed_subject_alt_names:
                - san_type: URI
                  matcher:
                    prefix: "spiffe://prod.example.com/"
            validation_context_sds_secret_config:
              name: "ROOTCA"
              sds_config:
                api_config_source:
                  api_type: GRPC
                  transport_api_version: V3
                  grpc_services:
                    - envoy_grpc:
                        cluster_name: spire_agent

Envoy holds an open gRPC stream to the SPIRE Agent. When the agent rotates the SVID, it pushes the new certificate to Envoy over the stream. Envoy hot-swaps the certificate without dropping existing connections. The combined_validation_context validates peer certificates against the SPIRE-managed trust bundle (also updated dynamically) and enforces that the peer’s SAN URI matches the expected SPIFFE ID prefix.
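
The prefix matcher's behavior is worth spelling out, because the trailing slash does real work. The sketch below mimics a plain string-prefix comparison on the URI SAN (san_allowed is a hypothetical helper, not Envoy code) and shows what a prefix without the slash would let through.

```python
def san_allowed(uri_san, prefix="spiffe://prod.example.com/"):
    """Plain prefix comparison on the peer's URI SAN."""
    return uri_san.startswith(prefix)


# An ID in a look-alike trust domain that merely begins with the same characters.
LOOKALIKE = "spiffe://prod.example.com.attacker.test/ns/payments/sa/payment-api"

if __name__ == "__main__":
    # With the trailing slash the look-alike is rejected.
    print(san_allowed(LOOKALIKE))
    # Without it, the same comparison accepts the look-alike.
    print(san_allowed(LOOKALIKE, prefix="spiffe://prod.example.com"))
```

Chain validation against the trust bundle is still the primary control; the SAN matcher narrows which validated identities are accepted.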

See service mesh security for patterns on building authorization policy on top of mTLS identity.

Integrating with Istio: Replacing Citadel

In a default installation, istiod’s built-in CA (the component formerly called Citadel) issues certificates for sidecar proxies. It signs with a CA key held in a Secret in the istio-system namespace, which any cluster-admin can read. Replace that certificate issuance with SPIRE to get platform-attested workload identities with short TTLs and federation support.

Configure Istio to Use SPIRE

Patch Istio to use the SPIRE Workload API instead of Citadel:

# istio-operator.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio
  namespace: istio-system
spec:
  meshConfig:
    defaultConfig:
      proxyMetadata:
        PROXY_CONFIG_XDS_AGENT: "true"

  values:
    global:
      caAddress: "unix:///run/spire/sockets/agent.sock"

    pilot:
      env:
        ENABLE_CA_SERVER: "false"
        PILOT_CERT_PROVIDER: "spire"

  components:
    pilot:
      k8s:
        overlays:
          - apiVersion: apps/v1
            kind: Deployment
            name: istiod
            patches:
              - path: spec.template.spec.volumes[name:spire-agent-socket]
                value:
                  name: spire-agent-socket
                  hostPath:
                    path: /run/spire/sockets
                    type: DirectoryOrCreate
              - path: spec.template.spec.containers[name:discovery].volumeMounts[name:spire-agent-socket]
                value:
                  name: spire-agent-socket
                  mountPath: /run/spire/sockets
                  readOnly: true

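Each meshed workload still needs a registration entry so that the agent will issue its SVID. For the payment API:

# -parentID must be an exact SPIFFE ID (no wildcards); this one is a node alias
# entry created with: entry create -node -selector k8s_psat:cluster:prod-east
kubectl exec -n spire statefulset/spire-server -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/payments/sa/payment-api \
  -parentID spiffe://prod.example.com/spire/agent/cluster/prod-east \
  -selector k8s:ns:payments \
  -selector k8s:sa:payment-api \
  -x509SVIDTTL 3600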

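With this configuration, Istio’s Envoy sidecar fetches its certificate from the SPIRE Agent socket rather than from istiod’s built-in CA. This only works if the socket is also visible inside each sidecar: Istio’s SPIRE integration has the istio-agent look for an SDS socket at /var/run/secrets/workload-spiffe-uds/socket, so mount the agent socket at that path through a custom injection template or the SPIFFE CSI driver. The SVID’s URI SAN carries the SPIFFE ID, which Istio’s AuthorizationPolicy can match against: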

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payment-api-ingress
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-api
  rules:
    - from:
        - source:
            principals:
              - "prod.example.com/ns/frontend/sa/web-app"
      to:
        - operation:
            methods: ["POST"]
            paths: ["/v1/charge"]

The principals field holds the SPIFFE URI with the spiffe:// prefix removed; the trust domain stays in place, so it must match the one SPIRE issues under (prod.example.com here, not Istio’s default cluster.local). Because this is an ALLOW policy, any caller whose SVID doesn’t carry the exact web-app service account identity is denied, regardless of source IP or namespace.
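
The relationship between a SPIFFE ID and the string used in principals can be sketched as below, on the assumption that the principal is the SPIFFE ID without its spiffe:// scheme and with the trust domain retained. The helper name to_istio_principal is hypothetical.

```python
SCHEME = "spiffe://"


def to_istio_principal(spiffe_id):
    """Convert a SPIFFE ID into the principal string used in AuthorizationPolicy."""
    if not spiffe_id.startswith(SCHEME):
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return spiffe_id[len(SCHEME):]


if __name__ == "__main__":
    print(to_istio_principal("spiffe://prod.example.com/ns/frontend/sa/web-app"))
```

Generating policy principals from the same IDs used in registration entries avoids mismatches between what SPIRE issues and what the policy expects.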

Operational Considerations

SPIRE Server HA requires shared storage. For production, use PostgreSQL as the datastore and store the CA signing key in a KMS (AWS KMS, GCP Cloud HSM, or HashiCorp Vault). With disk-backed CA key storage, losing the server StatefulSet’s persistent volume loses the CA — all agents must re-attest and all existing SVIDs become unverifiable.

Registration entry management at scale becomes unwieldy with manual CLI commands. Use the SPIRE Controller Manager, which watches Kubernetes ClusterSPIFFEID custom resources (along with the related ClusterFederatedTrustDomain and ClusterStaticEntry resources) and reconciles registration entries automatically:

apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: payments-workloads
spec:
  spiffeIDTemplate: "spiffe://prod.example.com/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  podSelector:
    matchLabels:
      spiffe-managed: "true"
  ttl: "1h"
  federatesWith:
    - "spiffe://staging.example.com"

Audit logging from the SPIRE Server records every SVID issuance with the requesting agent’s node SVID and the matched registration entry. Ship these logs to your SIEM and alert on issuances that don’t match expected selector patterns — a node SVID issuing SVIDs for workloads it has never issued for before is an anomaly worth investigating.
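
The first-seen check described above can be prototyped in a few lines before it is turned into a SIEM rule. This sketch assumes issuance records have already been parsed into (agent ID, workload SPIFFE ID) pairs; that record shape and the first_seen_issuances name are hypothetical, not SPIRE's log format.

```python
def first_seen_issuances(records, baseline):
    """Return (agent_id, workload_id) pairs not present in the baseline.

    records is an iterable of (agent_id, workload_id) tuples in log order;
    baseline is a set of pairs already considered normal. Each new pair is
    reported once, the first time it appears.
    """
    seen = set(baseline)
    alerts = []
    for agent_id, workload_id in records:
        pair = (agent_id, workload_id)
        if pair not in seen:
            alerts.append(pair)
            seen.add(pair)
    return alerts
```

In practice the baseline needs to tolerate routine rescheduling, for example by keying on node pool rather than individual node, or the alert will fire on every pod move.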

Trust bundle rotation happens automatically within a trust domain but requires manual coordination for federated bundles. When rotating the root CA for a trust domain, publish the new bundle to the federation endpoint before removing the old one, and ensure all federated partners have fetched the new bundle before the old root expires. SPIRE’s bundle refresh interval defaults to 5 minutes — plan CA rotations with at least a 10-minute overlap window.

The combination of SPIFFE’s attestation-backed identity with sub-hour certificate TTLs means the window for credential abuse shrinks from months (static secrets) to the rotation interval. That’s the concrete security improvement SPIRE delivers over any secret-distribution approach that doesn’t bind credential lifetime to workload attestation.