eBPF Verifier Security Hardening
Problem
The eBPF verifier is the kernel subsystem that statically analyses every eBPF program before it is loaded into the kernel. Before a program can be attached to a kprobe, TC hook, XDP driver, or LSM hook, the verifier walks every possible execution path through the program’s bytecode and must prove that the program cannot perform out-of-bounds memory accesses, cannot loop unboundedly, and cannot perform illegal pointer arithmetic. The verifier is the primary and often only security boundary between user-submitted eBPF programs and the kernel’s own memory. If an attacker can trick the verifier into declaring an unsafe program safe, the attacker gains the ability to read and write arbitrary kernel memory from user space, which in practice means full local privilege escalation to root, and often container escape.
The verifier’s job is harder than it sounds. It does not execute the program; it maintains an abstract model of every register’s possible value range at every instruction. For each ALU operation, pointer dereference, or map lookup, the verifier must track whether the resulting register could hold a value outside a safe range. This abstract interpretation is implemented in roughly 15,000 lines of kernel/bpf/verifier.c, supported by a range-tracking library in kernel/bpf/tnum.c (tristate numbers, tracking which bits are definitely zero, definitely one, or unknown). The correctness of this model depends on precise handling of every ALU opcode, every conditional branch, and every type-narrowing inference. A single edge case where the abstract model diverges from what the hardware actually computes is sufficient for an attacker to break the security guarantee.
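To make the tristate-number idea concrete, the following is a minimal Python sketch of a tnum as a (value, mask) pair, with an addition routine modelled on the approach used in kernel/bpf/tnum.c. The class and helper names, and the explicit 64-bit masking, are choices made for this illustration and are not kernel code.

```python
from dataclasses import dataclass

U64 = (1 << 64) - 1  # eBPF registers are 64 bits wide


@dataclass(frozen=True)
class Tnum:
    value: int  # bits known to be 1
    mask: int   # bits whose value is unknown

    def contains(self, x: int) -> bool:
        """True if concrete value x agrees with every known bit."""
        return (x & ~self.mask & U64) == self.value


def tnum_const(x: int) -> Tnum:
    """A fully known value: no unknown bits."""
    return Tnum(x & U64, 0)


def tnum_add(a: Tnum, b: Tnum) -> Tnum:
    """Abstract addition: any bit a carry could disturb becomes unknown."""
    sm = (a.mask + b.mask) & U64
    sv = (a.value + b.value) & U64
    sigma = (sm + sv) & U64
    chi = sigma ^ sv            # bits that may differ because of unknown carries
    mu = chi | a.mask | b.mask  # all bits that are unknown in the result
    return Tnum(sv & ~mu & U64, mu)


if __name__ == "__main__":
    bit0_unknown = Tnum(value=0, mask=1)  # the register holds 0 or 1
    total = tnum_add(bit0_unknown, tnum_const(1))
    print(total)
    print([x for x in range(8) if total.contains(x)])
```

Adding the constant 1 to a register that may hold 0 or 1 gives a result with mask 3, which admits 0 through 3. That is a sound over-approximation of the true set {1, 2}. The security property depends on the abstract result never excluding a value the hardware can actually produce.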
GHSA-hfqc-63c7-rj9f (April 2026, discovered by Google Security Research) is exactly this class of bug. The verifier’s tracking of register value ranges had an edge case in the handling of certain ALU operations on 32-bit sub-registers. The Linux eBPF architecture defines both 64-bit (r0–r10) and 32-bit (w0–w10) register views; operations on 32-bit sub-registers are supposed to be zero-extended to 64 bits before the result is used in pointer arithmetic. The bug: certain ALU operations on 32-bit registers were being widened to 64-bit ranges by the verifier’s abstract model without correctly re-constraining the upper 32 bits. An attacker could construct a crafted eBPF program in which the verifier believed a memory offset was bounded within a safe range, while at runtime the actual 64-bit value of the register — after the silent widening — held an attacker-controlled offset. The result: the verifier would declare the program safe, and at runtime the program could dereference a kernel pointer with an attacker-controlled offset, enabling arbitrary kernel reads and writes. The fix was committed to the kernel’s bpf tree and the full advisory is published at https://github.com/google/security-research/security/advisories/GHSA-hfqc-63c7-rj9f.
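The class of bug described here can be illustrated with a toy interval model. This is a minimal Python sketch, not a reconstruction of the GHSA-hfqc-63c7-rj9f flaw: the function names and the deliberately unsound `alu32_add_range_unsound` model are hypothetical, chosen only to show what happens when the abstract range for a 32-bit operation ignores wrap-around and zero-extension.

```python
U32_MAX = 0xFFFF_FFFF


def alu32_add_concrete(a: int, b: int) -> int:
    """What the CPU computes: add, truncate to 32 bits, zero-extend to 64."""
    return (a + b) & U32_MAX


def alu32_add_range_sound(a_lo, a_hi, b_lo, b_hi):
    """Sound unsigned interval: fall back to the full u32 range if the sum may wrap."""
    lo, hi = a_lo + b_lo, a_hi + b_hi
    if hi > U32_MAX:
        return 0, U32_MAX
    return lo, hi


def alu32_add_range_unsound(a_lo, a_hi, b_lo, b_hi):
    """Hypothetical broken model: treats the add as 64-bit and never re-constrains."""
    return a_lo + b_lo, a_hi + b_hi


if __name__ == "__main__":
    a_lo, a_hi = 0xFFFF_FFF0, 0xFFFF_FFFF  # scalar range the attacker arranges
    b = 0x10
    runtime = alu32_add_concrete(a_hi, b)  # the value the register really holds
    for name, model in (("sound", alu32_add_range_sound),
                        ("unsound", alu32_add_range_unsound)):
        lo, hi = model(a_lo, a_hi, b, b)
        inside = lo <= runtime <= hi
        print(f"{name:8s} range=[{lo:#x}, {hi:#x}] contains {runtime:#x}: {inside}")
```

In the unsound model the runtime value lies outside the range the analysis believes in. Any bounds check the verifier approves on the basis of that range no longer constrains the real register.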
A critical operational reality — and the central open-source angle of this article — is the gap between when a verifier fix lands in the upstream kernel tree and when it reaches a distribution kernel package. Many eBPF verifier bug fixes are committed to the bpf-next or bpf kernel trees with commit messages like bpf: fix verifier range tracking for 32-bit ALU ops. No CVE is mentioned. No security advisory accompanies the commit. The commit appears in the log of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git within hours of being merged, but a sysadmin running Ubuntu 24.04 or RHEL 9 may wait 2–8 weeks before the fix appears in a distribution kernel update. During that window, a patch-gap attacker who monitors the bpf tree can read the diff, understand the verifier’s corrected abstract model, and construct a proof-of-concept that triggers the unpatched path. Google’s Project Zero and the broader research community have documented multiple eBPF verifier bugs with this pattern; several have had 6–12 week windows between upstream fix and distribution shipping. GHSA-hfqc-63c7-rj9f is more transparent than most — Google Security Research published a structured advisory — but even here the kernel commit predated the coordinated public disclosure.
Monitoring for verifier fixes before they acquire a CVE requires watching the right channels. The canonical source is the bpf kernel tree itself: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/log/ — filter commits touching kernel/bpf/verifier.c and kernel/bpf/tnum.c. The OSV database (https://osv.dev) indexes kernel CVEs and can be queried for BPF-tagged advisories. The oss-security@openwall.com mailing list is where coordinated kernel security disclosures are made public; BPF disclosure threads appear there for bugs that do get CVEs. Subscribe to linux-kernel-announce for major kernel releases, and watch your distribution’s kernel security mailing list (Ubuntu’s ubuntu-security-announce, Red Hat’s errata feed) for the downstream fix notification. Combining upstream tree monitoring with OSV queries gives the earliest possible warning.
Target systems: Linux kernel >= 4.4 with CONFIG_BPF_JIT=y and kernel.unprivileged_bpf_disabled=0, including Ubuntu 22.04+, Debian 12+, and RHEL 9/10 when they run with that setting. The value 0 is the upstream default unless the kernel is built with CONFIG_BPF_UNPRIV_DEFAULT_OFF=y, and several current distributions ship a non-zero default, so check the running value instead of assuming it. Any system where an unprivileged user can call bpf(BPF_PROG_LOAD, ...) is in scope. Systems that have already set kernel.unprivileged_bpf_disabled to 1 or 2 are protected from unprivileged exploitation but remain at risk from any process holding CAP_BPF or CAP_SYS_ADMIN.
Threat Model
- Unprivileged local user privilege escalation. An attacker with a shell account, or code execution in any process running as a non-root UID, calls bpf(BPF_PROG_LOAD, ...) with a crafted eBPF program that exploits a verifier register range tracking bug. The verifier accepts the program as safe. At runtime the program performs kernel arbitrary read/write using an attacker-controlled pointer offset. The attacker reads kernel credentials structures, overwrites task_struct.cred, and escalates to root. This is the direct exploitation path for GHSA-hfqc-63c7-rj9f.
- Container workload with CAP_BPF escaping to the host. A Kubernetes pod with securityContext.capabilities.add: ["BPF"] (or a DaemonSet running Cilium, Tetragon, or Falco with elevated privileges) is compromised through application-layer code injection. The attacker uses the process’s existing CAP_BPF to load a crafted BPF program that exploits the verifier. Because BPF programs share the host kernel’s address space regardless of container namespacing, a successful verifier bypass inside a container is a container escape: the attacker can read or write host kernel memory.
- Patch-gap attacker. A sophisticated attacker monitors git.kernel.org/bpf for commits touching kernel/bpf/verifier.c. When a commit appears that changes register range tracking logic, particularly around 32-bit sub-register ALU operations or tnum range constraints, the attacker reads the diff, identifies the pre-fix logical path, and constructs an eBPF program that exercises that path. Distribution kernels typically lag the upstream bpf tree by 2–8 weeks. The attacker holds a working proof-of-concept during this window, targeting systems where unprivileged BPF is enabled and the distribution kernel has not yet shipped the fix.
- CI/CD runner as BPF loading vector. Many CI/CD platforms deploy observability agents (Cilium for network visibility, Tetragon for process auditing, Falco for syscall monitoring) that require CAP_BPF or CAP_SYS_ADMIN. An attacker who achieves code injection in the CI pipeline, through a compromised build script, a malicious dependency, or a poisoned container image, gains the ability to load BPF programs with the runner’s capabilities. The runner’s elevated BPF access becomes a privilege escalation primitive if the kernel verifier is unpatched.
The blast radius of a successful verifier bypass is unbounded: arbitrary kernel read/write means the attacker can extract secrets from kernel memory (including other tenants’ process memory via /proc/kcore or direct mapping), modify kernel credentials for any process, install a kernel-level rootkit, or disable security mechanisms (SELinux enforcement mode, audit rules) by patching kernel data structures in place.
Configuration / Implementation
Disabling Unprivileged BPF
The most effective single mitigation is preventing unprivileged processes from loading BPF programs. The kernel.unprivileged_bpf_disabled sysctl controls this:
- Value 0 (the upstream default unless the kernel is built with CONFIG_BPF_UNPRIV_DEFAULT_OFF=y): any unprivileged user can call bpf(BPF_PROG_LOAD, ...).
- Value 1: unprivileged BPF is disabled and the setting is locked; once written, it cannot be changed again until reboot, even by root.
- Value 2: unprivileged BPF is disabled, but root can still write 0 or 1 at runtime.
Prefer value 1 on production systems. Value 2 provides weaker protection because any process that achieves root (through another vulnerability) can re-enable unprivileged BPF before loading a BPF exploit.
# Apply immediately (cannot be undone until reboot)
sysctl -w kernel.unprivileged_bpf_disabled=1
# Verify
sysctl kernel.unprivileged_bpf_disabled
# Expected: kernel.unprivileged_bpf_disabled = 1
# Persist across reboots
cat > /etc/sysctl.d/90-bpf-hardening.conf << 'EOF'
# Disable unprivileged BPF loading.
# Value 1 cannot be changed again without a reboot, unlike value 2.
kernel.unprivileged_bpf_disabled = 1
# Harden BPF JIT: enable constant blinding for all users, mitigate JIT spraying.
net.core.bpf_jit_harden = 2
# Do not export JIT-compiled program symbols to /proc/kallsyms.
net.core.bpf_jit_kallsyms = 0
EOF
sysctl -p /etc/sysctl.d/90-bpf-hardening.conf
BPF JIT Hardening
Even when BPF programs require privilege to load, the JIT compiler introduces additional attack surface. net.core.bpf_jit_harden=2 enables constant blinding for all users (immediate constants in JIT-compiled code are replaced with XOR-masked values, which defeats JIT spraying attacks); value 1 applies it to unprivileged loads only. net.core.bpf_jit_kallsyms=0 keeps JIT image symbols out of /proc/kallsyms, limiting an attacker’s ability to locate JIT-compiled code in kernel address space. Both sysctls live under net.core (/proc/sys/net/core/), not under kernel.
# Confirm JIT is enabled (value 1 or 2)
cat /proc/sys/net/core/bpf_jit_enable
# Confirm harden level (run as root)
sysctl net.core.bpf_jit_harden
# Expected: net.core.bpf_jit_harden = 2
# Confirm JIT symbols are not exported to /proc/kallsyms (run as root)
sysctl net.core.bpf_jit_kallsyms
# Expected: net.core.bpf_jit_kallsyms = 0
# Look for BPF JIT messages in the kernel log after boot
dmesg | grep -i "bpf jit"
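The checks in this section can also be run as one audit. The following is a minimal Python sketch that reads the three settings directly from /proc/sys; the `EXPECTED` table and the `audit` function are hypothetical names for this illustration, and the script treats either non-zero value of unprivileged_bpf_disabled as hardened.

```python
from pathlib import Path

# Path relative to /proc/sys -> set of values this sketch treats as hardened
EXPECTED = {
    "kernel/unprivileged_bpf_disabled": {"1", "2"},
    "net/core/bpf_jit_harden": {"2"},
    "net/core/bpf_jit_kallsyms": {"0"},
}


def audit(proc_sys: Path = Path("/proc/sys")) -> dict:
    """Return {sysctl_name: (current_value_or_None, ok)} for each setting."""
    report = {}
    for rel, good in EXPECTED.items():
        try:
            current = (proc_sys / rel).read_text().strip()
        except OSError:
            current = None  # file missing or not readable without root
        report[rel.replace("/", ".")] = (current, current in good)
    return report


if __name__ == "__main__":
    for name, (value, ok) in audit().items():
        print(f"{'OK  ' if ok else 'WARN'} {name} = {value}")
```

The JIT entries are readable only by root on most kernels, so an unprivileged run reports them as unreadable and not as misconfigured.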
Auditing Processes with BPF Capabilities
Setting kernel.unprivileged_bpf_disabled to a non-zero value protects against unprivileged users, but processes holding CAP_BPF or CAP_SYS_ADMIN can still load BPF programs. Audit the capability surface regularly:
# Check capabilities of a specific process
getpcaps <pid>
# Find all processes with CAP_BPF or CAP_SYS_ADMIN
for pid in /proc/[0-9]*/status; do
  caps=$(grep -E "^CapEff:" "$pid" 2>/dev/null | awk '{print $2}')
  if [ -n "$caps" ]; then
    # CAP_BPF = bit 39 (0x8000000000), CAP_SYS_ADMIN = bit 21 (0x200000)
    dec_caps=$(printf "%d" "0x$caps" 2>/dev/null)
    if (( (dec_caps & (1 << 39)) || (dec_caps & (1 << 21)) )); then
      echo "PID $(basename "$(dirname "$pid")"): $(cat "$(dirname "$pid")/comm" 2>/dev/null) has CAP_BPF or CAP_SYS_ADMIN"
    fi
  fi
done
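The bit test in the loop above can be expressed more readably. The following is a minimal Python sketch of the same check; the function names `bpf_capable` and `scan` are hypothetical, and the capability numbers (CAP_SYS_ADMIN = 21, CAP_BPF = 39) are the ones used in the shell version.

```python
from pathlib import Path

CAP_SYS_ADMIN = 21
CAP_BPF = 39


def bpf_capable(cap_eff_hex: str) -> list:
    """Return the BPF-relevant capability names set in a CapEff hex mask."""
    mask = int(cap_eff_hex, 16)
    names = []
    if mask & (1 << CAP_BPF):
        names.append("CAP_BPF")
    if mask & (1 << CAP_SYS_ADMIN):
        names.append("CAP_SYS_ADMIN")
    return names


def scan(proc: Path = Path("/proc")):
    """Yield (pid, name, capabilities) for every process that can load BPF."""
    for status in proc.glob("[0-9]*/status"):
        try:
            lines = status.read_text().splitlines()
        except OSError:
            continue  # process exited or is not readable
        fields = dict(line.split(":", 1) for line in lines if ":" in line)
        caps = bpf_capable(fields.get("CapEff", "0").strip())
        if caps:
            yield status.parent.name, fields.get("Name", "?").strip(), caps


if __name__ == "__main__":
    for pid, name, caps in scan():
        print(f"PID {pid}: {name} has {' and '.join(caps)}")
```

Only the effective set (CapEff) is inspected, as in the shell loop. A process can still hold either capability in its permitted set and raise it later, so a stricter audit would also check CapPrm.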
In Kubernetes, identify pods with elevated BPF capabilities:
# Find pods adding BPF or SYS_ADMIN capabilities
kubectl get pods --all-namespaces -o json | jq -r '
  .items[] |
  . as $pod |
  .spec.containers[] |
  select(
    .securityContext.capabilities.add? |
    arrays |
    any(. == "BPF" or . == "SYS_ADMIN")
  ) |
  [$pod.metadata.namespace, $pod.metadata.name, .name] |
  join("/")
'
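The jq filter inspects only regular containers and explicit capability additions. The following is a minimal Python sketch that runs the same audit over the JSON from kubectl and also covers init containers, privileged containers, and an added "ALL". The function name `risky_containers` and the script name in the usage comment are hypothetical.

```python
import json
import sys

RISKY = {"BPF", "SYS_ADMIN"}


def risky_containers(pod_list: dict):
    """Yield (namespace, pod, container, reason) for containers able to load BPF."""
    for pod in pod_list.get("items", []):
        meta, spec = pod.get("metadata", {}), pod.get("spec", {})
        containers = (spec.get("initContainers") or []) + (spec.get("containers") or [])
        for c in containers:
            sc = c.get("securityContext") or {}
            added = set((sc.get("capabilities") or {}).get("add") or [])
            if sc.get("privileged"):
                reason = "privileged"
            elif (added & RISKY) or "ALL" in added:
                reason = "adds " + ",".join(sorted((added & RISKY) or {"ALL"}))
            else:
                continue
            yield meta.get("namespace"), meta.get("name"), c.get("name"), reason


if __name__ == "__main__":
    # Usage: kubectl get pods --all-namespaces -o json | python3 scan_pods.py
    raw = "" if sys.stdin.isatty() else sys.stdin.read()
    if raw.strip():
        for ns, pod, container, reason in risky_containers(json.loads(raw)):
            print(f"{ns}/{pod}/{container}: {reason}")
```

Privileged containers are flagged separately because they receive every capability without listing any under capabilities.add.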
Kubernetes Pod Security Admission (PSA) restricted profile prohibits both CAP_BPF and CAP_SYS_ADMIN. Apply it to namespaces that do not run observability DaemonSets:
# Label namespace to enforce restricted PSA profile
kubectl label namespace <namespace> \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest
BPF Program Allowlisting with LSM Hooks
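For environments that require BPF access for observability tools but want to restrict which processes can load programs, a BPF LSM policy can allowlist by UID, PID namespace, or binary path. This requires a kernel built with CONFIG_BPF_LSM=y and bpf included in the active LSM list. The bpf LSM hook fires on every bpf() syscall, before the requested command is carried out:
// Example BPF LSM program restricting BPF_PROG_LOAD to specific UIDs
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#define EPERM 1
// Placeholder UID of the observability service account; adjust per host
#define ALLOWED_OBS_UID 1001
// BPF LSM programs must declare a GPL-compatible license
char LICENSE[] SEC("license") = "GPL";
SEC("lsm/bpf")
int BPF_PROG(restrict_bpf_load, int cmd, union bpf_attr *attr, unsigned int size)
{
    // The lower 32 bits of the helper's return value hold the UID
    __u32 uid = bpf_get_current_uid_gid() & 0xffffffff;
    if (cmd == BPF_PROG_LOAD) {
        // Only allow UID 0 and the observability service UID (e.g., 1001)
        if (uid != 0 && uid != ALLOWED_OBS_UID) {
            return -EPERM;
        }
    }
    return 0;
}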
On Ubuntu kernels with AppArmor BPF mediation, an AppArmor profile can additionally restrict which confined processes may perform BPF map operations via kernel.bpf_map_permission controls, where the running kernel exposes them. Consult Ubuntu’s AppArmor BPF documentation for profile syntax.
Monitoring the BPF Kernel Tree for Verifier Fixes
Track upstream verifier changes before they acquire CVEs or appear in distribution kernels:
# Clone the bpf fixes tree (one-time setup)
git clone https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git /opt/bpf-kernel-tree
cd /opt/bpf-kernel-tree
# Check for recent verifier commits (run weekly or via cron)
git fetch origin --quiet
git log --oneline --since="30 days ago" origin/master -- kernel/bpf/verifier.c kernel/bpf/tnum.c
Automated alerting script (/opt/bpf-monitor/check-verifier.sh):
#!/usr/bin/env bash
set -euo pipefail
REPO_DIR="/opt/bpf-kernel-tree"
ALERT_EMAIL="security@example.com"
STATE_FILE="/opt/bpf-monitor/last-seen-commit"
WATCHED=(kernel/bpf/verifier.c kernel/bpf/tnum.c)
cd "$REPO_DIR"
git fetch origin --quiet
LATEST=$(git rev-parse origin/master)
LAST_SEEN=$(cat "$STATE_FILE" 2>/dev/null || echo "")
if [ "$LATEST" != "$LAST_SEEN" ]; then
  # Use the saved commit as the lower bound when it is still present in the repo;
  # on the first run (or after a history rewrite) fall back to the last 7 days.
  if [ -n "$LAST_SEEN" ] && git cat-file -e "${LAST_SEEN}^{commit}" 2>/dev/null; then
    NEW_COMMITS=$(git log --oneline "${LAST_SEEN}..origin/master" -- "${WATCHED[@]}")
  else
    NEW_COMMITS=$(git log --oneline --since="7 days ago" origin/master -- "${WATCHED[@]}")
  fi
  if [ -n "$NEW_COMMITS" ]; then
    echo "$NEW_COMMITS" | mail -s "[BPF VERIFIER ALERT] New commits in bpf tree" "$ALERT_EMAIL"
  fi
  echo "$LATEST" > "$STATE_FILE"
fi
# Run weekly via cron
echo "0 8 * * 1 root /opt/bpf-monitor/check-verifier.sh" > /etc/cron.d/bpf-verifier-monitor
chmod 644 /etc/cron.d/bpf-verifier-monitor
# Cross-reference with oss-security disclosures
# Subscribe: https://www.openwall.com/lists/oss-security/
# Query OSV for kernel CVEs and keep those whose summary mentions BPF
# (results are paginated; this inspects the first page only):
curl -s 'https://api.osv.dev/v1/query' \
  -H 'Content-Type: application/json' \
  -d '{"package": {"name": "Kernel", "ecosystem": "Linux"}}' | \
  jq '.vulns[]? | select(.id | startswith("CVE")) | select((.summary // "") | test("bpf"; "i")) | {id, summary}' 2>/dev/null | head -40
Blocking the BPF Syscall with Seccomp
For workloads that do not use BPF at all, block the bpf syscall entirely via seccomp. The syscall number is 321 on x86-64.
Seccomp profile JSON (save as /etc/seccomp/no-bpf.json):
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
  "syscalls": [
    {
      "names": ["bpf"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 1
    }
  ]
}
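Before deploying a profile like this one, it helps to confirm that it actually denies bpf. The following is a minimal Python sketch of such a check; `bpf_action` is a hypothetical helper, and it deliberately simplifies seccomp semantics by taking the first rule that names the syscall and ignoring argument filters and capability conditions.

```python
import json

PROFILE_JSON = """
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
  "syscalls": [
    {"names": ["bpf"], "action": "SCMP_ACT_ERRNO", "errnoRet": 1}
  ]
}
"""


def bpf_action(profile: dict, syscall: str = "bpf") -> str:
    """Return the action applied to `syscall`: first matching rule, else the default."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    return profile["defaultAction"]


if __name__ == "__main__":
    profile = json.loads(PROFILE_JSON)
    action = bpf_action(profile)
    print(f"bpf -> {action}")
    if action == "SCMP_ACT_ALLOW":
        raise SystemExit("profile does not block the bpf syscall")
```

A check of this kind fits in CI next to the profile file, so that a later edit that drops the bpf rule fails the build instead of silently re-opening the syscall.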
Apply to a container at runtime:
# Docker
docker run --security-opt seccomp=/etc/seccomp/no-bpf.json <image>
# Kubernetes pod spec — use RuntimeDefault which blocks BPF in most CRI implementations
# or reference a custom profile via a SeccompProfile object
Kubernetes pod seccomp configuration:
apiVersion: v1
kind: Pod
metadata:
  name: hardened-workload
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault  # Blocks bpf syscall for most workloads
  containers:
    - name: app
      image: app:latest
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
For observability DaemonSets that legitimately need BPF (Cilium, Tetragon, Falco), use seccompProfile.type: Unconfined with a capability grant scoped to CAP_BPF and CAP_PERFMON, and ensure the host kernel is patched:
securityContext:
  seccompProfile:
    type: Unconfined
  capabilities:
    add: ["BPF", "PERFMON"]
    drop: ["ALL"]
Testing Verifier Hardening in CI
Run the kernel’s own BPF selftests against the hardened configuration:
# Build and run BPF verifier selftests (requires kernel source)
make -C tools/testing/selftests/bpf run_tests 2>&1 | grep -E "(PASS|FAIL|ERROR)"
# Smoke test: confirm a privileged BPF program still loads and runs under the hardened settings
bpftrace -e 'BEGIN { printf("BPF load OK\n"); exit(); }'
# Check the JIT harden level (run as root)
sysctl net.core.bpf_jit_harden
# Confirm unprivileged BPF is blocked: run as non-root with a valid BPF object file
sudo -u nobody bpftool prog load <prog.o> /sys/fs/bpf/test 2>&1
# Expected: Operation not permitted
Expected Behaviour
| Signal | Unpatched kernel, BPF unrestricted | Patched + hardened |
|---|---|---|
| Unprivileged user calls bpf(BPF_PROG_LOAD, ...) | Program accepted by verifier; loads successfully | EPERM immediately; a non-zero kernel.unprivileged_bpf_disabled blocks the call before the verifier runs |
| Crafted verifier bypass program (GHSA-hfqc-63c7-rj9f pattern) submitted by CAP_BPF process | Verifier incorrectly declares safe; runtime achieves kernel arbitrary read/write | Patched kernel: verifier correctly rejects due to fixed 32-bit range widening logic; EACCES returned |
| Process reads /proc/kallsyms for JIT symbols | JIT-compiled BPF program addresses visible; useful for gadget location | net.core.bpf_jit_kallsyms=0 keeps JIT symbols out of /proc/kallsyms |
| Container with CAP_BPF loads an observability BPF program | Loads successfully; if kernel is unpatched, verifier bypass yields host kernel access | Loads successfully (CAP_BPF permitted); if kernel is patched, bypass rejected; monitor audit log for unexpected BPF loads outside known DaemonSet PIDs |
| New commit to kernel/bpf/verifier.c in bpf upstream tree | No alert; operator unaware until distribution ships kernel update weeks later | Monitoring script detects the new commit on its next scheduled run; alert triggers; operator evaluates diff and assesses urgency against distribution patch ETA |
Trade-offs
| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| Non-zero kernel.unprivileged_bpf_disabled | Eliminates the primary exploitation path for all verifier bugs; unprivileged users cannot reach the verifier | Breaks any unprivileged eBPF use: bpftrace without sudo, unprivileged observability agents, some eBPF-based network tools in user namespaces | Run observability tools as root or with CAP_BPF; audit which tools actually require unprivileged BPF; most production tools run privileged anyway |
| net.core.bpf_jit_harden=2 | Constant blinding defeats JIT spraying; mitigates code-reuse attacks against JIT-compiled BPF programs | 5–15% performance overhead on JIT-compiled BPF programs in hot paths (high-frequency XDP or TC programs may be affected) | Benchmark before enforcing in latency-sensitive environments; consider bpf_jit_harden=1 (unprivileged loads only) as a middle ground |
| Seccomp BPF syscall block (SCMP_ACT_ERRNO on bpf) | Completely eliminates BPF attack surface for workloads that do not need BPF | Breaks Cilium, Falco, Tetragon, and any eBPF-based observability or networking tool in the affected container or host | Apply only to workloads that explicitly do not use BPF; use a separate seccomp profile for observability DaemonSets; never apply to Cilium or CNI plugin pods |
| Kernel update cadence for BPF verifier fixes | Applying distribution kernel updates promptly closes the vulnerability window | Kernel updates require node drain and reboot in Kubernetes environments; disrupts running workloads; maintenance windows constrain frequency | Use live-patching where available (kpatch on RHEL, livepatch on Ubuntu) for critical verifier fixes; automate kernel update testing in a staging cluster |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Cilium or Falco fails after disabling unprivileged BPF (non-zero kernel.unprivileged_bpf_disabled) | Cilium agent CrashLoopBackOff; Falco fails to load probes; DaemonSet pods restart repeatedly | kubectl logs -n kube-system <cilium-pod> shows Operation not permitted on BPF prog load; systemctl status falco shows probe load failure | Expected for unprivileged configurations: Cilium and Falco require privileged BPF. Verify they run as root with CAP_BPF; the sysctl does not affect privileged BPF loads. Confirm pods have securityContext.capabilities.add: ["BPF"] and are not running as non-root UID |
| JIT hardening (bpf_jit_harden=2) breaks existing BPF maps after major kernel update | BPF programs return unexpected errors after kernel upgrade; map lookups fail; XDP programs drop packets incorrectly | `dmesg \| grep -i bpf` shows JIT errors; `bpftool prog list` shows programs in error state; application-level packet drops or metric gaps | |
| Seccomp BPF block breaks observability DaemonSet | Falco or Tetragon DaemonSet pods fail to start; node-level visibility gaps; security alerts stop arriving from affected nodes | kubectl describe pod <falco-pod> shows execve failed: Operation not permitted or BPF syscall error; Falco dashboard shows node as offline | Apply a separate seccomp profile for the observability DaemonSet that permits the bpf syscall; do not use a blanket host-level seccomp profile that blocks BPF; use Kubernetes SeccompProfile objects scoped to specific pods |
| Monitoring script generates false positives on non-security verifier commits | Alert fatigue; operators begin ignoring BPF verifier alerts | High alert volume; commits being flagged are performance fixes, test additions, or documentation changes in verifier.c | Tune the monitoring script to filter commit subjects: add a grep for terms associated with security fixes (range, tnum, ALU, scalar, ptr, unsafe, bypass); require manual triage for all alerts but reduce noise with subject filtering; cross-reference with OSV and oss-security before escalating |
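
The subject filtering suggested in the last row can be sketched as follows. This is a minimal Python illustration, not a tuned classifier: `triage` is a hypothetical helper, the keyword list is the one named in the table, and the commit lines in the usage section are made-up samples with placeholder hashes.

```python
import re

# Terms the table associates with security-relevant verifier fixes
KEYWORDS = ("range", "tnum", "alu", "scalar", "ptr", "unsafe", "bypass")
_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")", re.IGNORECASE)


def triage(oneline_log: str):
    """Split `git log --oneline` output into (flagged, other) commit lines."""
    flagged, other = [], []
    for line in oneline_log.splitlines():
        _sha, _, subject = line.partition(" ")
        (flagged if _PATTERN.search(subject) else other).append(line)
    return flagged, other


if __name__ == "__main__":
    # Hypothetical sample input; real input comes from the monitoring script
    sample = "\n".join([
        "aaaaaaa bpf: fix verifier range tracking for 32-bit ALU ops",
        "bbbbbbb selftests/bpf: add more verifier tests",
    ])
    flagged, other = triage(sample)
    print("escalate:", flagged)
    print("review later:", other)
```

Keyword matching only orders the queue. Unflagged commits still need a human look, because a security fix can carry an innocuous subject line.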