ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks

Problem

Webhook-based admission control (Kyverno, Gatekeeper, OPA, custom webhooks) has been the dominant pattern for enforcing organization-specific policies on Kubernetes resources for years. It works, but it brings four classes of risk that go unmentioned in most adoption stories:

  • Availability coupling. Every admission request to the affected resources blocks until the webhook responds. A webhook that goes down stops cluster operations cold. failurePolicy: Ignore makes the webhook optional, which means a partial outage silently lets violations through.
  • Network round-trip cost. Each admission decision crosses the cluster network, hits the webhook pod, runs evaluation logic, and returns. Latency is 5-50 ms per request, accumulating during burst deploys.
  • Operational footprint. A webhook pod needs Deployments, Services, certificates (the cert-manager dance), CA bundle injection into the ValidatingWebhookConfiguration, monitoring, autoscaling, and security review of the policy engine itself.
  • Versioning skew. Updating Kyverno or Gatekeeper means upgrading the policy engine in lockstep with the policies, often through Helm chart migrations across breaking versions.

Kubernetes 1.30 (April 2024) made ValidatingAdmissionPolicy (VAP) generally available. The kube-apiserver evaluates CEL (Common Expression Language) expressions inline during admission, with no webhook in the path. Kubernetes 1.32 added MutatingAdmissionPolicy as alpha; it was promoted to beta in 1.34.

For the majority of policies — naming conventions, label requirements, image registry allowlists, resource quotas, securityContext requirements — VAP is the right fit. Webhook engines remain useful for policies that need cross-resource lookups across namespaces, external API calls, or complex stateful logic.

This article covers the VAP resource model, common CEL patterns for security policies, parameterization, RBAC for policy management, and the operational migration from Kyverno/OPA where applicable.

Target systems: Kubernetes 1.30+ for VAP GA. Kubernetes 1.34+ for MatchConditions v2 and improved error messages.

Threat Model

  • Adversary 1 — Insider creating non-compliant resources: developer with namespace-scoped access who attempts to deploy a privileged pod, an image from an unapproved registry, or a workload bypassing required labels.
  • Adversary 2 — External attacker via compromised CI credentials: OIDC-federated CI token used to apply a manifest that escalates privileges.
  • Adversary 3 — Webhook outage as bypass vector: an attacker (or routine networking incident) that brings down the policy webhook so policies fail open.
  • Access level: Adversary 1 has namespace-edit RBAC. Adversary 2 has whatever the CI token grants. Adversary 3 has any disruption capability — even a cluster autoscaler event is enough.
  • Objective: Deploy resources that violate organizational policy in a way that gives the adversary persistent access, more permissions, or evades detection.
  • Blast radius: Without admission control: any privileged workload reachable by the cluster network or with hostPath mount can pivot to node-level access. With webhook-based control: blast radius depends on webhook availability. With VAP: same blast radius as Kubernetes RBAC, no separate availability concern.

Configuration

The VAP Resource Trio

VAP composes up to three resources: a ValidatingAdmissionPolicy holding the CEL logic, a ValidatingAdmissionPolicyBinding scoping where it applies, and an optional parameter object (covered below). The first two:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged-containers
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    - expression: >
        !object.spec.containers.exists(c,
          has(c.securityContext) && has(c.securityContext.privileged) &&
          c.securityContext.privileged == true)
      message: "Privileged containers are not allowed."
      reason: Forbidden
    - expression: >
        !has(object.spec.initContainers) ||
        !object.spec.initContainers.exists(c,
          has(c.securityContext) && has(c.securityContext.privileged) &&
          c.securityContext.privileged == true)
      message: "Privileged init containers are not allowed."
      reason: Forbidden
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-containers-everywhere
spec:
  policyName: deny-privileged-containers
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchExpressions:
        - key: pod-security.kubernetes.io/enforce
          operator: NotIn
          values: ["privileged"]

The Policy defines what to check. The Binding determines where the policy applies and what to do on a violation. validationActions: [Deny, Audit] rejects the request and emits an audit event with the violation details.
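Policies can be narrowed further with matchConditions — CEL predicates over the admission request that must all be true before any validations run. A hedged sketch (the exempted service-account prefix is an assumption, not part of the policy above) that skips requests from kube-system controllers:

```yaml
# Sketch: additional field under spec on the ValidatingAdmissionPolicy.
# Requests that fail a matchCondition skip validation entirely.
matchConditions:
  - name: exclude-kube-system-serviceaccounts
    expression: >-
      !request.userInfo.username.startsWith('system:serviceaccount:kube-system:')
```

This keeps the exemption logic in the policy itself rather than spreading it across binding selectors.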

Image Registry Allowlist with Parameters

For policies whose values vary across environments (allowed registries differ between staging and prod), use a ParamKind:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: enforce-image-registry-allowlist
spec:
  paramKind:
    apiVersion: policy.example.com/v1
    kind: AllowedRegistries
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  variables:
    - name: containerImages
      expression: >
        object.spec.containers.map(c, c.image) +
        (has(object.spec.initContainers) ?
          object.spec.initContainers.map(c, c.image) : [])
  validations:
    - expression: >
        variables.containerImages.all(img,
          params.spec.registries.exists(r, img.startsWith(r + "/")))
      messageExpression: >
        "Image must come from one of: " +
        params.spec.registries.join(", ")
      reason: Forbidden
---
# CRD for the parameter object.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: allowedregistries.policy.example.com
spec:
  group: policy.example.com
  scope: Cluster
  names:
    plural: allowedregistries
    kind: AllowedRegistries
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                registries:
                  type: array
                  items:
                    type: string
---
apiVersion: policy.example.com/v1
kind: AllowedRegistries
metadata:
  name: production-registries
spec:
  registries:
    - ghcr.io/myorg
    - my-registry.example.com
    - quay.io/myorg
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: registry-allowlist-prod
spec:
  policyName: enforce-image-registry-allowlist
  paramRef:
    name: production-registries
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: production

Different bindings reference different AllowedRegistries instances for staging vs. production. The same policy logic, parameterized.
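A staging binding is the same shape with a different paramRef and namespace selector — a sketch, assuming a separate staging-registries instance of AllowedRegistries exists:

```yaml
# Hypothetical staging binding: identical policy, different parameter object.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: registry-allowlist-staging
spec:
  policyName: enforce-image-registry-allowlist
  paramRef:
    name: staging-registries   # a separate AllowedRegistries instance
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: staging
```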

Common Security Patterns

# Require non-root user.
- expression: >
    object.spec.containers.all(c,
      has(c.securityContext) &&
      has(c.securityContext.runAsNonRoot) &&
      c.securityContext.runAsNonRoot == true)
  message: "All containers must set runAsNonRoot: true"

# Require resource limits.
- expression: >
    object.spec.containers.all(c,
      has(c.resources) &&
      has(c.resources.limits) &&
      has(c.resources.limits.cpu) &&
      has(c.resources.limits.memory))
  message: "All containers must specify CPU and memory limits"

# Forbid hostPath mounts.
- expression: >
    !has(object.spec.volumes) ||
    object.spec.volumes.all(v, !has(v.hostPath))
  message: "hostPath volumes are not allowed"

# Forbid hostNetwork / hostPID / hostIPC.
- expression: >
    !has(object.spec.hostNetwork) || !object.spec.hostNetwork
  message: "hostNetwork is not allowed"
- expression: >
    !has(object.spec.hostPID) || !object.spec.hostPID
  message: "hostPID is not allowed"

# Require seccompProfile.
- expression: >
    has(object.spec.securityContext) &&
    has(object.spec.securityContext.seccompProfile) &&
    object.spec.securityContext.seccompProfile.type in
      ["RuntimeDefault", "Localhost"]
  message: "seccompProfile must be RuntimeDefault or Localhost"

# Require approved labels.
- expression: >
    has(object.metadata.labels) &&
    "app.kubernetes.io/name" in object.metadata.labels &&
    "team" in object.metadata.labels
  message: "Pods must have app.kubernetes.io/name and team labels"

These cover most of the day-to-day “PSS Restricted plus organizational extras” enforcement that teams currently write Kyverno policies for.
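One more pattern in the same shape (a sketch, not in the list above): requiring allowPrivilegeEscalation: false, which the PSS Restricted profile also mandates.

```yaml
# Forbid privilege escalation (sketch; mirrors the patterns above).
- expression: >
    object.spec.containers.all(c,
      has(c.securityContext) &&
      has(c.securityContext.allowPrivilegeEscalation) &&
      c.securityContext.allowPrivilegeEscalation == false)
  message: "All containers must set allowPrivilegeEscalation: false"
```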

Cross-Resource Lookups via Parameter Resources

VAP expressions can consult the incoming object, the old object, and the authorizer, but they cannot query arbitrary cluster objects during admission. paramKind provides a limited form of cross-resource dependency: the parameter can be any resource, including a native ConfigMap. For a policy that depends on a ConfigMap value (e.g., a list of approved teams):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: team-must-be-approved
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  paramKind:
    apiVersion: v1
    kind: ConfigMap
  variables:
    - name: team
      expression: >
        has(object.metadata.labels) &&
        has(object.metadata.labels.team) ?
          object.metadata.labels.team : ""
    - name: approvedTeams
      expression: params.data.teams.split(",")
  validations:
    - expression: variables.team in variables.approvedTeams
      messageExpression: >
        "Team '" + variables.team + "' is not in the approved list"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: team-approval
spec:
  policyName: team-must-be-approved
  paramRef:
    name: approved-teams
    namespace: kube-system
    parameterNotFoundAction: Deny
  validationActions: [Deny, Audit]

For lookups across arbitrary resources or external systems, VAP is not the right tool — fall back to Kyverno or a custom webhook.

Auditing Without Enforcing (Dry-Run)

Before flipping a policy to Deny, run it in Audit mode. Violations appear in the audit log without rejecting requests:

spec:
  validationActions: [Audit, Warn]

Combined with audit-log analysis (a query against your SIEM for annotations.validation.policy.admission.k8s.io/validation_failure), you discover which workloads would have been rejected and can fix them before enforcement.
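To make that query concrete, here is a minimal Python sketch (assuming the audit log is exported as JSON lines, with annotations at the top level of each event per the Kubernetes audit Event schema) that pulls out violation annotations:

```python
import json

# VAP records violation details under this audit-annotation key.
PREFIX = "validation.policy.admission.k8s.io/validation_failure"

def extract_violations(audit_lines):
    """Return (auditID, failure-detail) pairs from JSON-lines audit events."""
    hits = []
    for line in audit_lines:
        event = json.loads(line)
        for key, value in event.get("annotations", {}).items():
            if key.startswith(PREFIX):
                hits.append((event.get("auditID"), value))
    return hits

# Synthetic audit events for illustration.
sample = [
    json.dumps({
        "auditID": "0001",
        "annotations": {
            "validation.policy.admission.k8s.io/validation_failure":
                '[{"message": "Privileged containers are not allowed."}]',
        },
    }),
    json.dumps({"auditID": "0002", "annotations": {}}),
]

print(extract_violations(sample))
```

The same filter expressed as a SIEM query (match on the annotation key prefix) drives the pre-enforcement cleanup.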

RBAC for Policy Management

Policies are cluster-scoped resources with elevated impact. Restrict who can write them:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: admission-policy-author
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources:
      - validatingadmissionpolicies
      - validatingadmissionpolicybindings
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Reserve for the security/platform team only.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: security-team-policy-authors
roleRef:
  kind: ClusterRole
  name: admission-policy-author
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: Group
    name: security-engineering
    apiGroup: rbac.authorization.k8s.io

Application teams should not have create/update on these resources. The ParamKind parameters can be more permissive — a team-specific AllowedRegistries instance can be edited by the team itself if scoped correctly.
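Scoping a team's write access to its own parameter object can be done with resourceNames. A sketch (role and object names are assumptions; note resourceNames cannot restrict list, watch, or create, so grant only the verbs shown):

```yaml
# Let one team edit only its own cluster-scoped parameter object.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: payments-registry-params-editor
rules:
  - apiGroups: ["policy.example.com"]
    resources: ["allowedregistries"]
    resourceNames: ["payments-registries"]
    verbs: ["get", "update", "patch"]
```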

Expected Behaviour

Signal-by-signal comparison, webhook engine vs. VAP:

  • Policy evaluation latency — webhook: 5-50 ms (network round-trip plus engine eval); VAP: < 1 ms (in-process CEL).
  • Webhook pod outage impact — webhook: cluster admission stalls or fails open; VAP: no webhook involved, no impact.
  • Policy-machinery CRD count — webhook: many (Kyverno: ~10, Gatekeeper: ~5); VAP: none (ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding are built-in API types), plus any ParamKind CRDs you define.
  • Audit-log entry on violation — webhook: engine-specific annotation; VAP: native validation.policy.admission.k8s.io/validation_failure annotation.
  • Policy rollout via GitOps — webhook: Argo/Flux apply Kyverno resources; VAP: same workflow, applying built-in VAP resources.
  • Cross-resource queries — webhook: native via Kyverno match and context; VAP: limited, falls back to a webhook.

Verify VAP enforcement:

# Apply a violating pod, expect rejection.
kubectl run test --image=docker.io/library/nginx --dry-run=client -o yaml | \
  kubectl apply -f -
# Error from server (Forbidden): pods "test" is forbidden:
# ValidatingAdmissionPolicy 'enforce-image-registry-allowlist' with binding
# 'registry-allowlist-prod' denied request: Image must come from one of:
# ghcr.io/myorg, ...

# In Audit mode, violations are recorded in the apiserver audit log, not
# in Events; search it for the policy annotation (log path varies by
# distribution):
grep 'validation.policy.admission.k8s.io/validation_failure' \
  /var/log/kube-apiserver/audit.log

Trade-offs

  • In-process evaluation. Benefit: no network round-trip, no availability risk. Cost: CEL has a smaller standard library than Rego or Kyverno’s expression language. Mitigation: use VAP for the policies CEL handles cleanly; keep webhook engines for the rest.
  • Native API surface. Benefit: no third-party CRDs to upgrade in lockstep with Kubernetes. Cost: limited cross-resource awareness; cannot call external systems. Mitigation: use paramRef for static config; for dynamic lookups, keep webhook-based engines.
  • ParamKind for env-specific values. Benefit: single policy, multiple parameter objects per environment. Cost: requires defining a CRD for each parameter shape. Mitigation: use ConfigMap as the param kind for simple cases.
  • Audit-mode rollout. Benefit: safe deployment of new policies. Cost: requires an audit-log pipeline to make use of the data. Mitigation: pipe audit logs to your existing SIEM; query for validation_failure annotations.
  • Migration from Kyverno/OPA. Benefit: reduced operational footprint. Cost: migration time; not all policies port cleanly. Mitigation: inventory policies first; convert the obvious 80% and leave webhook engines for the long-tail policies that need their richer features.
  • RBAC tightness on policies. Benefit: policy authors are a small, trusted set. Cost: new policy creation is a slow, gated process. Mitigation: use parameters (ParamKind) to push environment-specific configuration to teams while keeping policy logic gated.

Failure Modes

  • CEL syntax error. Symptom: policy not enforced; kube-apiserver logs a compilation failure. Detection: kubectl describe validatingadmissionpolicy shows a TypeChecking condition carrying the error. Recovery: check the TypeChecking status and exercise the policy in a kind cluster before pushing to shared clusters.
  • Param resource missing. Symptom: the bound policy fails open or fails closed depending on parameterNotFoundAction. Detection: audit logs show admission attempts with a param-not-found annotation. Recovery: set parameterNotFoundAction: Deny for security policies, and ensure GitOps applies parameter resources before bindings.
  • failurePolicy: Ignore set on a security policy. Symptom: violations slip through when the apiserver evaluator hits an internal error. Detection: audit logs missing expected violations during apiserver health blips. Recovery: use failurePolicy: Fail for security policies; the evaluator has no external dependencies, so failure is rare and indicates a cluster-level issue worth blocking on.
  • Policy too restrictive, blocks platform components. Symptom: new cluster components (cert-manager controller, GPU operator) fail to install. Detection: pod create requests rejected with the policy message; system events flood. Recovery: use matchResources.namespaceSelector with kubernetes.io/metadata.name NotIn [kube-system, ...]; exempt namespaces with the pod-security.kubernetes.io/enforce: privileged label or a dedicated policy-exempt label.
  • Evaluation hits the CEL cost limit. Symptom: policies on resources with very large fields (large ConfigMaps, status fields) fail evaluation. Detection: audit logs show a cost-limit-exceeded error. Recovery: restructure the expression to short-circuit early; use variables to extract subsets and avoid repeated traversal.
  • Audit annotations missed in SIEM. Symptom: violations occur but nobody knows. Detection: spot-checks show audit-log entries with policy annotations that never appear in SIEM dashboards. Recovery: confirm the audit-log pipeline forwards metadata.annotations and that the SIEM indexes them; build a dashboard on validation_failure annotation counts by policy.
  • Mass policy update breaks production. Symptom: a bad CEL change rejects all pod creates cluster-wide. Detection: new deploys fail across all namespaces immediately after a policy update. Recovery: roll out new policies with validationActions: [Audit] first, observe for 1-2 days, then add Deny; a Git revert of the policy commit triggers GitOps to re-apply the prior version.

Migrating from Kyverno or Gatekeeper

Most existing Kyverno validate rules and Gatekeeper constraints map to VAP. Walk the policy inventory and bucket each:

  • Required labels / annotations — portable. Direct CEL translation.
  • Image registry allowlist — portable. Use paramKind for environment differences.
  • Privileged container deny — portable. Native CEL on securityContext.
  • Resource limits required — portable. Direct CEL.
  • Network policy default-deny — not portable (mutate/generate). Stays in Kyverno generate rules.
  • Cross-namespace consistency check — not portable (cross-resource lookup). Stays in the webhook engine.
  • External API call (CMDB lookup) — not portable. Custom webhook.

Run both in parallel during migration. VAP in [Audit], Kyverno in enforce. Once VAP audit logs show parity over 1-2 weeks, switch VAP to [Deny, Audit] and remove the corresponding Kyverno policies.
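As a flavor of the translation work, here is a typical Kyverno required-label rule next to its VAP equivalent — both sketches, with names chosen for illustration:

```yaml
# Kyverno (before): validate rule requiring a team label.
# - name: require-team-label
#   validate:
#     message: "Pods must carry a team label"
#     pattern:
#       metadata:
#         labels:
#           team: "?*"

# VAP (after): the equivalent CEL validation.
validations:
  - expression: >-
      has(object.metadata.labels) && "team" in object.metadata.labels
    message: "Pods must carry a team label"
```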