Kubernetes Subresource RBAC Escalation: Restricting exec, portforward, and proxy

Problem

Kubernetes RBAC subresources are sub-paths of standard resources, and each one is granted separately from its parent resource. pods/exec grants the ability to run arbitrary commands inside any pod the grant covers. pods/portforward grants the ability to open a TCP tunnel to any port on any such pod. nodes/proxy grants the ability to send arbitrary HTTP requests to the Kubelet API on any node. pods/log grants read access to the stdout/stderr of every container in scope, which for a cluster-wide binding means every container in the cluster.
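As a small illustration of the sub-path model, the sketch below builds the request path that a pod subresource call targets on the API server. The helper name pod_subresource_path and the namespace and pod names are made up for this example; the path layout is the standard one for core-group pod subresources.

```shell
# Build the kube-apiserver path for a pod subresource request.
# pod_subresource_path is an illustrative helper, not part of kubectl.
pod_subresource_path() {
  # $1 = namespace, $2 = pod name, $3 = subresource (exec, portforward, log, attach)
  printf '/api/v1/namespaces/%s/pods/%s/%s\n' "$1" "$2" "$3"
}

# RBAC evaluates this request as resource "pods" with subresource "exec",
# which is why a rule must name "pods/exec" rather than just "pods".
pod_subresource_path payments api-7d9f exec
```

On a live cluster, kubectl auth can-i create pods/exec -n <namespace> reports whether the current credential holds the corresponding grant.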

These are not narrowly scoped API actions. They are interactive access primitives that bypass application-level authentication entirely. A principal with pods/exec on all pods does not need to know any application password, service account token, or encryption key stored in a pod’s environment. They can simply exec in, read environment variables and mounted files, inspect process memory where the container's privileges allow it, and exfiltrate the results. The result is equivalent to an interactive shell inside the container with whatever privileges its processes hold.

Despite this, these subresource permissions are systematically over-provisioned in most clusters:

Developer onboarding shortcuts. Many organizations grant pods/exec cluster-wide to developers to make debugging convenient. The reasoning is “they could just scale down and replace the pod anyway” — but this conflates the ability to change cluster state with the ability to read live process memory and in-flight network traffic, which are categorically different.

Helm chart and operator defaults. The cluster-admin ClusterRole grants every subresource through its wildcard rule, and the built-in admin and edit ClusterRoles include pods/exec and pods/portforward by default. Many Helm-chart-generated roles include them as well, either because the authors needed them during development or because they didn’t audit subresource grants specifically.

nodes/proxy in monitoring roles. Prometheus and many monitoring tools need to scrape Kubelet metrics. Some integrations use nodes/proxy to reach the /metrics endpoint on the Kubelet. The same grant also covers the Kubelet's command and tunnel endpoints (/exec, /run, /portForward) and, through the API server proxy path, /logs. A principal that can reach the Kubelet port directly can call these endpoints without going through the kube-apiserver, so those calls never appear in the apiserver audit log.

pods/log for log aggregation. Log collection agents are sometimes granted pods/log cluster-wide. This means any compromise of the log agent pod yields read access to every container’s stdout/stderr across the entire cluster — including application secrets, session tokens, and private keys that applications inadvertently log.

A concrete escalation chain: attacker compromises a developer workstation → steals kubeconfig with pods/exec on all namespaces → execs into a secrets-management pod → reads Vault unseal keys or tokens → full secrets infrastructure compromise. The kube-apiserver audit log records the exec, but only as a detection signal: by the time anyone reviews the entry, the compromise is already complete.

The nodes/proxy case is especially subtle. The Kubelet API is not fully subject to the same RBAC model as the kube-apiserver. A principal with nodes/proxy can reach Kubelet endpoints that pre-date RBAC and may have weaker authentication requirements depending on the cluster configuration. When the Kubelet delegates authorization to the API server, it maps every request path that has no dedicated subresource, including /exec, /run, and /portForward, to nodes/proxy, so a single grant covers all of them.
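The reach of a nodes/proxy grant is easier to see as a lookup table. The sketch below maps a Kubelet request path to the node subresource the Kubelet checks when authorization is delegated to the API server. It follows the mapping described in the Kubernetes Kubelet authentication/authorization documentation; kubelet_authz_subresource is a hypothetical helper, and newer releases may add finer-grained subresources, so verify against your version.

```shell
# Map a Kubelet request path to the node subresource used for authorization.
# Assumption: the classic mapping (stats, metrics, log, spec, checkpoint,
# everything else -> proxy). Check the documentation for your release.
kubelet_authz_subresource() {
  case "$1" in
    /stats|/stats/*)           echo "nodes/stats" ;;
    /metrics|/metrics/*)       echo "nodes/metrics" ;;
    /logs|/logs/*)             echo "nodes/log" ;;
    /spec|/spec/*)             echo "nodes/spec" ;;
    /checkpoint|/checkpoint/*) echo "nodes/checkpoint" ;;
    *)                         echo "nodes/proxy" ;;
  esac
}

# Metrics and stats have narrow subresources; command execution does not.
for path in /metrics/cadvisor /stats/summary /exec/ns/pod/ctr /run/ns/pod/ctr /portForward/ns/pod; do
  printf '%-22s -> %s\n' "$path" "$(kubelet_authz_subresource "$path")"
done
```

The design point is the default branch: anything without its own subresource falls through to nodes/proxy, which is why that grant is so much broader than nodes/metrics.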

Target systems: Kubernetes 1.21–1.32 on all managed and self-managed distributions; any cluster where RBAC was set up incrementally rather than from a designed least-privilege baseline; clusters migrated from older Kubernetes versions that predate modern RBAC awareness.


Threat Model

Adversary 1 — Compromised developer credential. Access level: kubeconfig or service account token belonging to a developer with pods/exec on production namespaces. Objective: exec into a pod processing payment data, read application memory or environment variables, exfiltrate secrets without any application-layer authentication.

Adversary 2 — Compromised CI/CD service account. Access level: service account used by a deployment pipeline with over-provisioned RBAC inherited from a cluster-admin template. Objective: exec into running pods to extract current secrets, modify in-flight requests, or pivot to other namespaces.

Adversary 3 — Compromised monitoring agent. Access level: a Prometheus pod with nodes/proxy cluster-wide. Objective: call the Kubelet's /run endpoint to execute commands in containers on any node. If the agent can reach the Kubelet port directly, these calls bypass kube-apiserver audit logging.

Adversary 4 — Internal namespace escalation. Access level: a pod in a low-privilege namespace with namespace-scoped pods/exec. Objective: exec into a higher-privilege pod that runs in the same namespace (e.g., a secrets injector sidecar), read its token, and use it to access other namespaces.

Without controls: any credential with subresource grants is effectively a master key to every pod it can target. With controls: JIT access limits the window; Kyverno policies enforce subresource restrictions; audit logging provides detection coverage.


Configuration / Implementation

Step 1 — Audit who currently holds dangerous subresource grants

#!/bin/bash
# audit-subresource-rbac.sh
# List ClusterRoles that grant dangerous subresources, whether named
# explicitly or covered by a wildcard ("*" or "*/<subresource>").
set -euo pipefail

DANGEROUS_SUBRESOURCES=("pods/exec" "pods/portforward" "pods/log" "nodes/proxy" "pods/attach")

# Fetch the roles once and reuse the result for every subresource
CLUSTERROLES_JSON=$(kubectl get clusterroles -o json)

echo "=== Scanning ClusterRoles ==="
for subresource in "${DANGEROUS_SUBRESOURCES[@]}"; do
  sub=${subresource#*/}
  echo "--- $subresource ---"
  jq -r --arg full "$subresource" --arg sub "$sub" '
    .items[] |
    select(any(.rules[]?;
      # All three conditions must hold within the same rule
      any(.apiGroups[]?; . == "" or . == "*") and
      any(.resources[]?; . == $full or . == "*" or . == ("*/" + $sub)) and
      any(.verbs[]?; . == "get" or . == "create" or . == "*")
    )) |
    .metadata.name
  ' <<<"$CLUSTERROLES_JSON" | sort -u
done

echo ""
echo "=== Bindings for cluster-admin (includes all subresources) ==="
kubectl get clusterrolebindings -o json | jq -r '
  .items[] |
  select(.roleRef.name == "cluster-admin") |
  "\(.metadata.name): \(.subjects[]? | "\(.kind)/\(.name)")"
'

Run the audit, then list the namespace-scoped bindings so that each flagged role can be traced to its subjects:

bash audit-subresource-rbac.sh 2>/dev/null

# List RoleBindings per namespace (a RoleBinding may reference a ClusterRole)
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  echo "=== $ns ==="
  kubectl get rolebindings -n "$ns" -o json | jq -r '
    .items[] | "\(.roleRef.kind)/\(.roleRef.name): \(.subjects[]? | "\(.kind)/\(.name)")"
  '
done
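An audit of subresource grants has to account for wildcards, because RBAC matches a rule's resource entries in more than one way. The sketch below mirrors the three forms that cover a subresource request: the exact name, "*", and "*/<subresource>". rule_covers is a hypothetical helper, and it ignores apiGroups and verbs for brevity.

```shell
# Return 0 if one entry of a rule's resources list covers the requested
# subresource. Illustrative only: real authorization also checks apiGroups,
# verbs, and resourceNames.
rule_covers() {
  # $1 = entry from the rule, $2 = requested "resource/subresource"
  [ "$1" = "*" ] && return 0             # everything in the API group
  [ "$1" = "$2" ] && return 0            # exact match, e.g. pods/exec
  [ "$1" = "*/${2#*/}" ] && return 0     # any resource with that subresource
  return 1
}

for entry in "pods" "pods/exec" "*" "*/exec" "pods/log"; do
  if rule_covers "$entry" "pods/exec"; then
    echo "$entry grants pods/exec"
  else
    echo "$entry does not grant pods/exec"
  fi
done
```

Note that a grant on "pods" alone does not cover pods/exec, which is why a role can safely keep read access to pods after the subresource entries are removed.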

Step 2 — Remove subresource grants from standing roles

For each identified role with unnecessary subresource grants, create a replacement without them:

# Replace an over-provisioned developer role
# Before (problematic):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: developer-access-REPLACE-THIS
rules:
- apiGroups: [""]
  resources: ["pods", "pods/exec", "pods/portforward", "pods/log"]
  verbs: ["*"]

---
# After (scoped):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: developer-access
  annotations:
    security.example.com/review-date: "2026-05-12"
    security.example.com/owner: "platform-team"
rules:
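# Standard pod operations only. pods/exec, pods/portforward, and pods/log are
# deliberately absent: RBAC is additive, so omitting a resource denies it.
# (A rule with an empty verbs list is rejected by API validation.)
# Use centralized logging instead of pods/log.
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]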
# Deployments for rollout management
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "watch", "update", "patch"]

For Prometheus, replace nodes/proxy with nodes/metrics and scrape each Kubelet directly on its metrics endpoints rather than through the API server proxy path:

# Prometheus role: metrics only, no node proxy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-scraper
rules:
# Kubelet metrics: metrics-only subresource, not the full proxy
- apiGroups: [""]
  resources: ["nodes/metrics"]
  verbs: ["get"]
# Service discovery for node, pod, and service targets
# (nodes is needed only if Prometheus uses node discovery)
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions", "networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch"]

Step 3 — Enforce subresource restrictions with Kyverno

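Prevent new roles from naming dangerous subresources without an explicit review annotation. The rule below checks literal resource entries only; it does not catch grants made through "*" or "*/exec" wildcards.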

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-dangerous-subresources
  annotations:
    policies.kyverno.io/title: Restrict Dangerous RBAC Subresources
    policies.kyverno.io/description: >
      Prevents ClusterRoles and Roles from granting pods/exec, pods/portforward,
      or nodes/proxy without an explicit security review annotation.
spec:
  validationFailureAction: Enforce
  background: true
  rules:
  - name: deny-exec-without-review
    match:
      any:
      - resources:
          kinds: [ClusterRole, Role]
    validate:
      message: >
        pods/exec, pods/portforward, and nodes/proxy require annotation
        security.example.com/subresource-review=approved
      deny:
        conditions:
          all:
          - key: "{{ request.object.rules[].resources[] | contains(@, 'pods/exec') || contains(@, 'pods/portforward') || contains(@, 'nodes/proxy') }}"
            operator: Equals
            value: true
          - key: "{{ request.object.metadata.annotations.\"security.example.com/subresource-review\" || '' }}"
            operator: NotEquals
            value: "approved"
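One way to check that the policy behaves as intended is to submit a probe role with a server-side dry run. The sketch below only generates the manifest. The role name and namespace are placeholders, and the expectation that the dry run is rejected assumes the policy is installed in Enforce mode and that the Kyverno admission webhook handles dry-run requests.

```shell
# Print a throwaway Role that requests pods/exec without the review annotation.
# The name and namespace are placeholders chosen for this sketch.
probe_role_manifest() {
  cat <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kyverno-probe-exec
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
EOF
}

# With the policy enforced, this should be denied and nothing is persisted:
#   probe_role_manifest | kubectl apply --dry-run=server -f -
probe_role_manifest
```

Running the same probe with the annotation security.example.com/subresource-review: approved added under metadata should be admitted, which confirms both branches of the rule.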

Step 4 — Implement JIT access for exec and portforward

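Replace standing pods/exec grants with just-in-time access: a Role and RoleBinding that are created on request and deleted when a TTL elapses. Kubernetes has no built-in expiry for RBAC objects, so the script records the expiry in an annotation and schedules the deletion itself: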

#!/bin/bash
# jit-exec-access.sh
# Grant temporary pods/exec access and schedule its removal.
# Kubernetes does not expire RBAC objects, so cleanup is scheduled with at(1).
set -euo pipefail

REQUESTER=${1:-}
NAMESPACE=${2:-}
TTL_MINUTES=${3:-30}
REQUEST_TICKET=${4:-}  # Required: every grant must reference a ticket

if [[ -z "$REQUESTER" || -z "$NAMESPACE" || -z "$REQUEST_TICKET" ]]; then
  echo "Usage: $0 <username> <namespace> <ttl-minutes> <ticket-id>"
  exit 1
fi

BINDING_NAME="jit-exec-${REQUESTER}-$(date +%s)"
# Recorded in annotations for auditing (GNU date syntax)
EXPIRY=$(date -u -d "+${TTL_MINUTES} minutes" '+%Y-%m-%dT%H:%M:%SZ')

# Create a role with exec access in the target namespace.
# create covers SPDY clients; get covers WebSocket clients.
# The subresource-review annotation satisfies the Step 3 Kyverno policy.
kubectl create role "${BINDING_NAME}" \
  --namespace="${NAMESPACE}" \
  --verb=create --verb=get \
  --resource=pods/exec \
  --dry-run=client -o json | \
  jq --arg expiry "$EXPIRY" --arg ticket "$REQUEST_TICKET" '
    .metadata.annotations += {
      "security.example.com/subresource-review": "approved",
      "security.example.com/expires-at": $expiry,
      "security.example.com/ticket": $ticket
    }
  ' | kubectl apply -f -

# Bind to the requester and carry the same ticket and expiry on the binding
kubectl create rolebinding "${BINDING_NAME}" \
  --namespace="${NAMESPACE}" \
  --role="${BINDING_NAME}" \
  --user="${REQUESTER}"
kubectl annotate rolebinding "${BINDING_NAME}" --namespace="${NAMESPACE}" \
  "security.example.com/expires-at=${EXPIRY}" \
  "security.example.com/ticket=${REQUEST_TICKET}"

echo "Granted pods/exec in ${NAMESPACE} to ${REQUESTER}"
echo "Expires at: ${EXPIRY}"
echo "Ticket: ${REQUEST_TICKET}"
echo ""
echo "Access command:"
echo "  kubectl exec -it <pod-name> -n ${NAMESPACE} -- /bin/sh"
echo ""
echo "To revoke early:"
echo "  kubectl delete role,rolebinding ${BINDING_NAME} -n ${NAMESPACE}"

# Schedule cleanup. at(1) takes a relative time here; it does not parse the
# ISO 8601 timestamp stored in the annotation.
at now + "${TTL_MINUTES}" minutes <<EOF
kubectl delete role,rolebinding "${BINDING_NAME}" -n "${NAMESPACE}" 2>/dev/null
echo "JIT access for ${REQUESTER} in ${NAMESPACE} expired (ticket: ${REQUEST_TICKET})"
EOF
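Because a scheduled at job can be lost (for example, if the host that ran the script is rebuilt), a periodic reconciliation pass is a useful backstop. The sketch below is one possible shape for it, assuming the expires-at annotation written on the Role by jit-exec-access.sh and that the Role and RoleBinding share a name. is_expired and reap_expired_jit_bindings are hypothetical helpers; the second one needs kubectl and jq and is only defined here, not run.

```shell
# Return 0 when an expires-at timestamp is at or before "now".
# Both values must be UTC in the fixed format YYYY-MM-DDTHH:MM:SSZ, which
# sorts chronologically when compared as plain strings.
is_expired() (
  # $1 = expires-at annotation value, $2 = current time in the same format
  earliest=$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1)
  [ "$earliest" = "$1" ]
)

# Delete every JIT role and binding in a namespace whose expiry has passed.
reap_expired_jit_bindings() (
  ns=$1
  now=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
  kubectl get roles -n "$ns" -o json |
    jq -r '.items[]
      | select(.metadata.annotations["security.example.com/expires-at"] != null)
      | "\(.metadata.name) \(.metadata.annotations["security.example.com/expires-at"])"' |
    while read -r name expires; do
      if is_expired "$expires" "$now"; then
        kubectl delete role,rolebinding "$name" -n "$ns" </dev/null
      fi
    done
)
```

Calling reap_expired_jit_bindings <namespace> from a cron job or CronJob every few minutes bounds how long a grant can outlive its TTL.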

For production use, integrate with your ticketing system and approval workflow. The policy below audits the JIT bindings and flags any that lack a ticket annotation:

# Kyverno policy: require a ticket annotation on JIT exec bindings.
# The jit-exec-* name filter matches the naming used by jit-exec-access.sh;
# without it the rule would flag every binding in the cluster.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-jit-annotation-for-exec-bindings
spec:
  validationFailureAction: Audit
  rules:
  - name: annotate-exec-bindings
    match:
      any:
      - resources:
          kinds: [RoleBinding, ClusterRoleBinding]
          names: ["jit-exec-*"]
    validate:
      message: "JIT exec bindings must have a ticket annotation"
      pattern:
        metadata:
          annotations:
            security.example.com/ticket: "?*"

Step 5 — Restrict nodes/proxy via NetworkPolicy and Kubelet auth

Remove nodes/proxy from all non-system roles and enforce direct Kubelet authentication:

# Ensure Kubelet requires authentication for all endpoints
# Check current configuration (kubeadm stores it in this ConfigMap;
# other distributions may keep it elsewhere)
kubectl get configmap kubelet-config -n kube-system -o yaml | \
  grep -E -A8 "authentication:|authorization:"

# Kubelet config should have:
# authentication:
#   anonymous:
#     enabled: false    # No unauthenticated access
#   webhook:
#     enabled: true     # Use kube-apiserver for AuthN
# authorization:
#   mode: Webhook       # Use kube-apiserver RBAC for AuthZ

The corresponding settings in the Kubelet configuration file:

# /var/lib/kubelet/config.yaml (on each node)
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
    cacheTTL: 2m0s
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s

With authorization.mode: Webhook, the Kubelet asks the kube-apiserver to authorize every request through a SubjectAccessReview, so RBAC decides who can reach each Kubelet endpoint and nodes/proxy grants become visible and auditable.
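A quick on-node check can confirm that a configuration file carries these settings. The sketch below is a line-oriented test that assumes the block-style YAML layout shown above with two-space indentation; it is not a YAML parser, and check_kubelet_config is a hypothetical helper name.

```shell
# Exit 0 if the file disables anonymous auth and uses Webhook authorization.
# Assumes block-style YAML with two-space indentation, as in the sample above.
check_kubelet_config() (
  awk '
    /^authentication:/ { section = "authn"; in_anon = 0; next }
    /^authorization:/  { section = "authz"; in_anon = 0; next }
    /^[^ #]/           { section = ""; in_anon = 0 }   # any other top-level key
    section == "authn" && /^  anonymous:/ { in_anon = 1; next }
    section == "authn" && /^  [^ ]/       { in_anon = 0 }
    section == "authn" && in_anon && /^    enabled:[ ]*false/ { anon_off = 1 }
    section == "authz" && /^  mode:[ ]*Webhook/ { webhook = 1 }
    END { exit !(anon_off && webhook) }
  ' "$1"
)

# Example (on a node):
#   check_kubelet_config /var/lib/kubelet/config.yaml && echo "kubelet auth OK"
```

For fleet-wide verification, a benchmark tool such as kube-bench is more robust than text matching; this check is meant only as a fast sanity test.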

Step 6 — Alert on subresource access in the audit log

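Configure the kube-apiserver audit policy to record exec, attach, and port-forward requests at RequestResponse level, and log reads and node proxy requests at Request level:

# kube-apiserver audit policy
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Interactive access. SPDY clients issue create; WebSocket clients issue get.
- level: RequestResponse
  verbs: ["create", "get"]
  resources:
  - group: ""
    resources:
    - pods/exec
    - pods/portforward
    - pods/attach
# Log reads and Kubelet proxying. No verbs filter: nodes/proxy requests
# can use any HTTP method.
- level: Request
  resources:
  - group: ""
    resources:
    - pods/log
    - nodes/proxy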

Alert rule for unexpected exec activity:

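# Prometheus alerting rule. Assumes audit events are exported to a metric by a
# log-to-metrics pipeline that adds the labels used below; the built-in
# apiserver_audit_event_total counter does not carry them.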
- alert: UnexpectedPodExec
  expr: |
    sum by (user_username, objectRef_namespace, objectRef_name) (
      rate(apiserver_audit_event_total{
        verb="create",
        resource="pods",
        subresource="exec"
      }[5m])
    ) > 0
  labels:
    severity: warning
  annotations:
    summary: "pods/exec called by {{ $labels.user_username }}"
    description: "Exec into pod {{ $labels.objectRef_name }} in {{ $labels.objectRef_namespace }}"

Expected Behaviour

| Signal | Before hardening | After hardening |
|---|---|---|
| Developer kubeconfig can kubectl exec into production pods | Permitted (standing grant) | Denied; requires JIT request with ticket |
| kubectl auth can-i create pods/exec --as developer@example.com | yes | no |
| Prometheus scrape works without nodes/proxy | Requires nodes/proxy | Works via nodes/metrics only |
| Kyverno audit: roles with exec grants lacking annotation | No alert | Policy violation recorded |
| kube-apiserver audit log captures exec operations | May not be at RequestResponse level | All exec/portforward requests logged at RequestResponse level |

Verification:

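# List custom ClusterRoles that still grant exec, portforward, or proxy
kubectl get clusterroles -o json | jq -r '
  .items[] |
  select(.metadata.name | test("^system:") | not) |
  select(any(.rules[]?;
    any(.apiGroups[]?; . == "" or . == "*") and
    any(.resources[]?; . == "*" or test("exec|portforward|proxy")) and
    any(.verbs[]?; . == "create" or . == "get" or . == "*")
  )) |
  .metadata.name
'
# Expected: only roles that have passed review. The built-in cluster-admin,
# admin, and edit roles are listed unless they have been modified.

# Confirm Kubelet rejects unauthenticated requests
curl -sk -o /dev/null -w '%{http_code}\n' https://NODE-IP:10250/metrics
# Expected: 401 (no metrics returned without authentication)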

Trade-offs

| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| Removing standing exec grants | Eliminates standing compromise path | Debugging workflows break; developers must request access | Implement JIT access tool with < 5 minute provisioning time; integrate with Slack bot |
| JIT access via RBAC bindings | Auditable; time-limited | Requires tooling to manage; at-based cleanup is fragile | Use a purpose-built JIT tool (Teleport, Pomerium) or a Kubernetes operator that reconciles TTL-annotated bindings |
| Replacing nodes/proxy with nodes/metrics | Removes node RCE path from monitoring role | Some monitoring tools hard-code nodes/proxy | File upstream issue; short-term: bind nodes/proxy only to the monitoring service account and alert on its use (nodes are cluster-scoped, so the grant cannot be limited to a namespace) |
| Kyverno enforcement on new roles | Prevents RBAC sprawl from re-introducing grants | Blocks Helm chart installs that include exec grants | Switch to Audit mode initially; create per-chart exceptions with ticket references |

Failure Modes

| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| JIT access cleanup fails | Expired bindings remain active beyond TTL | Cron job or monitoring checks for bindings with past expires-at annotations | Implement reconciliation loop that deletes expired bindings every 5 minutes |
| Operator or controller requires pods/exec at startup | Operator crashloops with 403 on exec call | Operator logs show Forbidden; events show RBAC failure | Audit operator for legitimate exec use; if needed, scope exec grant to its own pod/namespace only |
| Kyverno policy blocks emergency access during incident | On-call engineer cannot exec into crashing pod | Engineer receives 403; escalation required | Maintain documented break-glass procedure: annotated role + binding exempted from the policy through a Kyverno PolicyException or an exclude block, requiring two-person authorization |
| Audit log does not capture exec on all nodes | Some nodes use local Kubelet authz; exec bypasses kube-apiserver audit | Audit coverage gap discovered in post-incident review | Enforce Kubelet authorization.mode: Webhook on all nodes; verify with kube-bench |