ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks

Problem

Webhook-based admission control (Kyverno, Gatekeeper, OPA, custom webhooks) has been the dominant pattern for enforcing organization-specific policies on Kubernetes resources for years. It works, but it brings four classes of risk that go unmentioned in most adoption stories:

  • Availability coupling. Every admission request to the affected resources blocks until the webhook responds. A webhook that goes down stops cluster operations cold. failurePolicy: Ignore makes the webhook optional, which means a partial outage silently lets violations through.
  • Network round-trip cost. Each admission decision crosses the cluster network, hits the webhook pod, runs evaluation logic, and returns. Latency is 5-50 ms per request, accumulating during burst deploys.
  • Operational footprint. A webhook pod needs Deployments, Services, certificates (the cert-manager dance), CA bundle injection into the ValidatingWebhookConfiguration, monitoring, autoscaling, and security review of the policy engine itself.
  • Versioning skew. Updating Kyverno or Gatekeeper means upgrading the policy engine in lockstep with the policies, often through Helm chart migrations across breaking versions.

Kubernetes 1.30 (April 2024) made ValidatingAdmissionPolicy (VAP) generally available. The kube-apiserver evaluates CEL (Common Expression Language) expressions inline during admission, with no webhook in the path. Kubernetes 1.32 added MutatingAdmissionPolicy as alpha; it was promoted to beta in 1.34.

For the majority of policies — naming conventions, label requirements, image registry allowlists, resource quotas, securityContext requirements — VAP is the right fit. Webhook engines remain useful for policies that need cross-resource lookups across namespaces, external API calls, or complex stateful logic.

This article covers the VAP resource model, common CEL patterns for security policies, parameterization, RBAC for policy management, and the operational migration from Kyverno/OPA where applicable.

Target systems: Kubernetes 1.30+ for VAP GA. Kubernetes 1.34+ for MatchConditions v2 and improved error messages.

Threat Model

  • Adversary 1 — Insider creating non-compliant resources: developer with namespace-scoped access who attempts to deploy a privileged pod, an image from an unapproved registry, or a workload bypassing required labels.
  • Adversary 2 — External attacker via compromised CI credentials: OIDC-federated CI token used to apply a manifest that escalates privileges.
  • Adversary 3 — Webhook outage as bypass vector: an attacker (or routine networking incident) that brings down the policy webhook so policies fail open.
  • Access level: Adversary 1 has namespace-edit RBAC. Adversary 2 has whatever the CI token grants. Adversary 3 has any disruption capability — even a cluster autoscaler event is enough.
  • Objective: Deploy resources that violate organizational policy in a way that gives the adversary persistent access, more permissions, or evades detection.
  • Blast radius: Without admission control: any privileged workload reachable by the cluster network or with hostPath mount can pivot to node-level access. With webhook-based control: blast radius depends on webhook availability. With VAP: same blast radius as Kubernetes RBAC, no separate availability concern.

Configuration

The VAP Resource Trio

VAP composes up to three resources: a ValidatingAdmissionPolicy holding the CEL logic, a ValidatingAdmissionPolicyBinding scoping where it applies, and an optional parameter object (covered below). The first two:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged-containers
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    - expression: >
        !object.spec.containers.exists(c,
          has(c.securityContext) && has(c.securityContext.privileged) &&
          c.securityContext.privileged == true)
      message: "Privileged containers are not allowed."
      reason: Forbidden
    - expression: >
        !has(object.spec.initContainers) ||
        !object.spec.initContainers.exists(c,
          has(c.securityContext) && has(c.securityContext.privileged) &&
          c.securityContext.privileged == true)
      message: "Privileged init containers are not allowed."
      reason: Forbidden
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-containers-everywhere
spec:
  policyName: deny-privileged-containers
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchExpressions:
        - key: pod-security.kubernetes.io/enforce
          operator: NotIn
          values: ["privileged"]

The Policy defines what to check. The Binding determines where the policy applies and what to do on a violation. validationActions: [Deny, Audit] rejects the request and emits an audit event with the violation details.
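Policies can be narrowed further with matchConditions — CEL predicates over the admission request that must all be true before any validations run. A hedged sketch (the exempted service-account prefix is an assumption, not part of the policy above) that skips requests from kube-system controllers:

```yaml
# Sketch: additional field under spec on the ValidatingAdmissionPolicy.
# Requests that fail a matchCondition skip validation entirely.
matchConditions:
  - name: exclude-kube-system-serviceaccounts
    expression: >-
      !request.userInfo.username.startsWith('system:serviceaccount:kube-system:')
```

This keeps the exemption logic in the policy itself rather than spreading it across binding selectors.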

Image Registry Allowlist with Parameters

For policies whose values vary across environments (allowed registries differ between staging and prod), use a ParamKind:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: enforce-image-registry-allowlist
spec:
  paramKind:
    apiVersion: policy.example.com/v1
    kind: AllowedRegistries
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  variables:
    - name: containerImages
      expression: >
        object.spec.containers.map(c, c.image) +
        (has(object.spec.initContainers) ?
          object.spec.initContainers.map(c, c.image) : [])
  validations:
    - expression: >
        variables.containerImages.all(img,
          params.spec.registries.exists(r, img.startsWith(r + "/")))
      messageExpression: >
        "Image must come from one of: " +
        params.spec.registries.join(", ")
      reason: Forbidden
---
# CRD for the parameter object.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: allowedregistries.policy.example.com
spec:
  group: policy.example.com
  scope: Cluster
  names:
    plural: allowedregistries
    kind: AllowedRegistries
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                registries:
                  type: array
                  items:
                    type: string
---
apiVersion: policy.example.com/v1
kind: AllowedRegistries
metadata:
  name: production-registries
spec:
  registries:
    - ghcr.io/myorg
    - my-registry.example.com
    - quay.io/myorg
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: registry-allowlist-prod
spec:
  policyName: enforce-image-registry-allowlist
  paramRef:
    name: production-registries
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: production

Different bindings reference different AllowedRegistries instances for staging vs. production. The same policy logic, parameterized.
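A staging binding is the same shape with a different paramRef and namespace selector — a sketch, assuming a separate staging-registries instance of AllowedRegistries exists:

```yaml
# Hypothetical staging binding: identical policy, different parameter object.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: registry-allowlist-staging
spec:
  policyName: enforce-image-registry-allowlist
  paramRef:
    name: staging-registries   # a separate AllowedRegistries instance
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: staging
```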

Common Security Patterns

# Require non-root user.
- expression: >
    object.spec.containers.all(c,
      has(c.securityContext) &&
      has(c.securityContext.runAsNonRoot) &&
      c.securityContext.runAsNonRoot == true)
  message: "All containers must set runAsNonRoot: true"

# Require resource limits.
- expression: >
    object.spec.containers.all(c,
      has(c.resources) &&
      has(c.resources.limits) &&
      has(c.resources.limits.cpu) &&
      has(c.resources.limits.memory))
  message: "All containers must specify CPU and memory limits"

# Forbid hostPath mounts.
- expression: >
    !has(object.spec.volumes) ||
    object.spec.volumes.all(v, !has(v.hostPath))
  message: "hostPath volumes are not allowed"

# Forbid hostNetwork / hostPID / hostIPC.
- expression: >
    !has(object.spec.hostNetwork) || !object.spec.hostNetwork
  message: "hostNetwork is not allowed"
- expression: >
    !has(object.spec.hostPID) || !object.spec.hostPID
  message: "hostPID is not allowed"

# Require seccompProfile.
- expression: >
    has(object.spec.securityContext) &&
    has(object.spec.securityContext.seccompProfile) &&
    object.spec.securityContext.seccompProfile.type in
      ["RuntimeDefault", "Localhost"]
  message: "seccompProfile must be RuntimeDefault or Localhost"

# Require approved labels.
- expression: >
    has(object.metadata.labels) &&
    "app.kubernetes.io/name" in object.metadata.labels &&
    "team" in object.metadata.labels
  message: "Pods must have app.kubernetes.io/name and team labels"

These cover most of the day-to-day “PSS Restricted plus organizational extras” enforcement that teams currently write Kyverno policies for.
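One more pattern in the same shape (a sketch, not in the list above): requiring allowPrivilegeEscalation: false, which the PSS Restricted profile also mandates.

```yaml
# Forbid privilege escalation (sketch; mirrors the patterns above).
- expression: >
    object.spec.containers.all(c,
      has(c.securityContext) &&
      has(c.securityContext.allowPrivilegeEscalation) &&
      c.securityContext.allowPrivilegeEscalation == false)
  message: "All containers must set allowPrivilegeEscalation: false"
```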

Cross-Resource Lookups via Parameter Resources

VAP expressions can consult the incoming object, the old object, and the authorizer, but they cannot query arbitrary cluster objects during admission. paramKind provides a limited form of cross-resource dependency: the parameter can be any resource, including a native ConfigMap. For a policy that depends on a ConfigMap value (e.g., a list of approved teams):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: team-must-be-approved
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  paramKind:
    apiVersion: v1
    kind: ConfigMap
  variables:
    - name: team
      expression: >
        has(object.metadata.labels) &&
        has(object.metadata.labels.team) ?
          object.metadata.labels.team : ""
    - name: approvedTeams
      expression: params.data.teams.split(",")
  validations:
    - expression: variables.team in variables.approvedTeams
      messageExpression: >
        "Team '" + variables.team + "' is not in the approved list"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: team-approval
spec:
  policyName: team-must-be-approved
  paramRef:
    name: approved-teams
    namespace: kube-system
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]

For lookups across arbitrary resources or external systems, VAP is not the right tool — fall back to Kyverno or a custom webhook.

Auditing Without Enforcing (Dry-Run)

Before flipping a policy to Deny, run it in Audit mode. Violations appear in the audit log without rejecting requests:

spec:
  validationActions: [Audit, Warn]

Combined with audit-log analysis (a query against your SIEM for annotations.validation.policy.admission.k8s.io/validation_failure), you discover which workloads would have been rejected and can fix them before enforcement.
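To make that query concrete, here is a minimal Python sketch (assuming the audit log is exported as JSON lines, with annotations at the top level of each event per the Kubernetes audit Event schema) that pulls out violation annotations:

```python
import json

# VAP records violation details under this audit-annotation key.
PREFIX = "validation.policy.admission.k8s.io/validation_failure"

def extract_violations(audit_lines):
    """Return (auditID, failure-detail) pairs from JSON-lines audit events."""
    hits = []
    for line in audit_lines:
        event = json.loads(line)
        for key, value in event.get("annotations", {}).items():
            if key.startswith(PREFIX):
                hits.append((event.get("auditID"), value))
    return hits

# Synthetic audit events for illustration.
sample = [
    json.dumps({
        "auditID": "0001",
        "annotations": {
            "validation.policy.admission.k8s.io/validation_failure":
                '[{"message": "Privileged containers are not allowed."}]',
        },
    }),
    json.dumps({"auditID": "0002", "annotations": {}}),
]

print(extract_violations(sample))
```

The same filter expressed as a SIEM query (match on the annotation key prefix) drives the pre-enforcement cleanup.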

RBAC for Policy Management

Policies are cluster-scoped resources with elevated impact. Restrict who can write them:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: admission-policy-author
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources:
      - validatingadmissionpolicies
      - validatingadmissionpolicybindings
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Reserve for the security/platform team only.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: security-team-policy-authors
roleRef:
  kind: ClusterRole
  name: admission-policy-author
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: Group
    name: security-engineering
    apiGroup: rbac.authorization.k8s.io

Application teams should not have create/update on these resources. The ParamKind parameters can be more permissive — a team-specific AllowedRegistries instance can be edited by the team itself if scoped correctly.
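Scoping a team's write access to its own parameter object can be done with resourceNames. A sketch (role and object names are assumptions; note resourceNames cannot restrict list, watch, or create, so grant only the verbs shown):

```yaml
# Let one team edit only its own cluster-scoped parameter object.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: payments-registry-params-editor
rules:
  - apiGroups: ["policy.example.com"]
    resources: ["allowedregistries"]
    resourceNames: ["payments-registries"]
    verbs: ["get", "update", "patch"]
```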

Expected Behaviour

Signal-by-signal comparison, webhook engine vs. VAP:

  • Policy evaluation latency — webhook: 5-50 ms (network round-trip plus engine eval); VAP: < 1 ms (in-process CEL).
  • Webhook pod outage impact — webhook: cluster admission stalls or fails open; VAP: no webhook involved, no impact.
  • Policy-machinery CRD count — webhook: many (Kyverno: ~10, Gatekeeper: ~5); VAP: none (ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding are built-in API types), plus any ParamKind CRDs you define.
  • Audit-log entry on violation — webhook: engine-specific annotation; VAP: native validation.policy.admission.k8s.io/validation_failure annotation.
  • Policy rollout via GitOps — webhook: Argo/Flux apply Kyverno resources; VAP: same workflow, applying built-in VAP resources.
  • Cross-resource queries — webhook: native via Kyverno match and context; VAP: limited, falls back to a webhook.

Verify VAP enforcement:

# Apply a violating pod, expect rejection.
kubectl run test --image=docker.io/library/nginx --dry-run=client -o yaml | \
  kubectl apply -f -
# Error from server (Forbidden): pods "test" is forbidden:
# ValidatingAdmissionPolicy 'enforce-image-registry-allowlist' with binding
# 'registry-allowlist-prod' denied request: Image must come from one of:
# ghcr.io/myorg, ...

# In Audit mode, violations are recorded in the apiserver audit log, not
# in Events; search it for the policy annotation (log path varies by
# distribution):
grep 'validation.policy.admission.k8s.io/validation_failure' \
  /var/log/kube-apiserver/audit.log

Trade-offs

  • In-process evaluation. Benefit: no network round-trip, no availability risk. Cost: CEL has a smaller standard library than Rego or Kyverno’s expression language. Mitigation: use VAP for the policies CEL handles cleanly; keep webhook engines for the rest.
  • Native API surface. Benefit: no third-party CRDs to upgrade in lockstep with Kubernetes. Cost: limited cross-resource awareness; cannot call external systems. Mitigation: use paramRef for static config; for dynamic lookups, keep webhook-based engines.
  • ParamKind for env-specific values. Benefit: single policy, multiple parameter objects per environment. Cost: requires defining a CRD for each parameter shape. Mitigation: use ConfigMap as the param kind for simple cases.
  • Audit-mode rollout. Benefit: safe deployment of new policies. Cost: requires an audit-log pipeline to make use of the data. Mitigation: pipe audit logs to your existing SIEM; query for validation_failure annotations.
  • Migration from Kyverno/OPA. Benefit: reduced operational footprint. Cost: migration time; not all policies port cleanly. Mitigation: inventory policies first; convert the obvious 80% and leave webhook engines for the long-tail policies that need their richer features.
  • RBAC tightness on policies. Benefit: policy authors are a small, trusted set. Cost: new policy creation is a slow, gated process. Mitigation: use parameters (ParamKind) to push environment-specific configuration to teams while keeping policy logic gated.

Failure Modes

  • CEL syntax error. Symptom: policy not enforced; kube-apiserver logs a compilation failure. Detection: kubectl describe validatingadmissionpolicy shows a TypeChecking condition carrying the error. Recovery: check the TypeChecking status and exercise the policy in a kind cluster before pushing to shared clusters.
  • Param resource missing. Symptom: the bound policy fails open or fails closed depending on parameterNotFoundAction. Detection: audit logs show admission attempts with a param-not-found annotation. Recovery: set parameterNotFoundAction: Deny for security policies, and ensure GitOps applies parameter resources before bindings.
  • failurePolicy: Ignore set on a security policy. Symptom: violations slip through when the apiserver evaluator hits an internal error. Detection: audit logs missing expected violations during apiserver health blips. Recovery: use failurePolicy: Fail for security policies; the evaluator has no external dependencies, so failure is rare and indicates a cluster-level issue worth blocking on.
  • Policy too restrictive, blocks platform components. Symptom: new cluster components (cert-manager controller, GPU operator) fail to install. Detection: pod create requests rejected with the policy message; system events flood. Recovery: use matchResources.namespaceSelector with kubernetes.io/metadata.name NotIn [kube-system, ...]; exempt namespaces with the pod-security.kubernetes.io/enforce: privileged label or a dedicated policy-exempt label.
  • Evaluation hits the CEL cost limit. Symptom: policies on resources with very large fields (large ConfigMaps, status fields) fail evaluation. Detection: audit logs show a cost-limit-exceeded error. Recovery: restructure the expression to short-circuit early; use variables to extract subsets and avoid repeated traversal.
  • Audit annotations missed in SIEM. Symptom: violations occur but nobody knows. Detection: spot-checks show audit-log entries with policy annotations that never appear in SIEM dashboards. Recovery: confirm the audit-log pipeline forwards metadata.annotations and that the SIEM indexes them; build a dashboard on validation_failure annotation counts by policy.
  • Mass policy update breaks production. Symptom: a bad CEL change rejects all pod creates cluster-wide. Detection: new deploys fail across all namespaces immediately after a policy update. Recovery: roll out new policies with validationActions: [Audit] first, observe for 1-2 days, then add Deny; a Git revert of the policy commit triggers GitOps to re-apply the prior version.

Migrating from Kyverno or Gatekeeper

Most existing Kyverno validate rules and Gatekeeper constraints map to VAP. Walk the policy inventory and bucket each:

  • Required labels / annotations — portable. Direct CEL translation.
  • Image registry allowlist — portable. Use paramKind for environment differences.
  • Privileged container deny — portable. Native CEL on securityContext.
  • Resource limits required — portable. Direct CEL.
  • Network policy default-deny — not portable (mutate/generate). Stays in Kyverno generate rules.
  • Cross-namespace consistency check — not portable (cross-resource lookup). Stays in the webhook engine.
  • External API call (CMDB lookup) — not portable. Custom webhook.

Run both in parallel during migration. VAP in [Audit], Kyverno in enforce. Once VAP audit logs show parity over 1-2 weeks, switch VAP to [Deny, Audit] and remove the corresponding Kyverno policies.
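As a flavor of the translation work, here is a typical Kyverno required-label rule next to its VAP equivalent — both sketches, with names chosen for illustration:

```yaml
# Kyverno (before): validate rule requiring a team label.
# - name: require-team-label
#   validate:
#     message: "Pods must carry a team label"
#     pattern:
#       metadata:
#         labels:
#           team: "?*"

# VAP (after): the equivalent CEL validation.
validations:
  - expression: >-
      has(object.metadata.labels) && "team" in object.metadata.labels
    message: "Pods must carry a team label"
```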