Detecting and Containing eBPF-Based Rootkits That Blind Your Observability Stack
Problem
eBPF observability tools — Falco, Tetragon, Cilium’s Hubble, and custom BPF programs — have become the dominant method for kernel-level security monitoring in Linux environments. They are low-overhead, production-safe, and capable of tracing system calls, network connections, and process executions with sub-millisecond latency. The assumption underlying all of them is that the BPF programs running in the kernel have exclusive, unfiltered visibility into kernel events.
eBPF rootkits break that assumption. Because eBPF programs run in kernel space alongside the observability stack, a sufficiently privileged attacker who loads their own BPF programs can attach to the same kernel hooks that security tools use and filter out their own activity before it reaches user space. The rootkit and the security tool are competing programs in the same BPF subsystem, and the rootkit has the advantage of moving second: it is loaded after the defender’s hooks are in place and is written with knowledge of them.
Published eBPF rootkit toolkits demonstrate this concretely:
- ebpfkit (2021, Guillaume Fournier/Datadog research): hooks sys_read and sys_write to hide files and processes from user-space tools; hooks tc network points to filter network telemetry; hooks the bpf() syscall itself to hide BPF maps from bpftool inspection.
- bad-bpf (2021, Pat Hogan): a collection of eBPF programs demonstrating TTY hijacking, process hiding via getdents64 hooking, and privilege escalation via bpf_probe_write_user.
- TripleCross (2022): an eBPF rootkit that uses Linux kernel runtime hijacking techniques, including task structure manipulation, to make processes invisible to ps, top, and /proc traversal.
The attack pattern is consistent across implementations:
- Load BPF programs with CAP_BPF (Linux 5.8+) or CAP_SYS_ADMIN.
- Hook kprobes or fentry/fexit on the same kernel functions that Falco/Tetragon monitor.
- Filter events: if the caller matches the rootkit’s process, return early before the security tool’s BPF program fires or before the data reaches user-space ring buffers.
- Optionally hook the bpf() syscall to hide the rootkit’s own BPF programs from bpftool prog list.
The effectiveness of this approach depends on two structural weaknesses in current BPF security tooling. First, most tools have no mechanism to detect that another BPF program is modifying the event stream they are reading. They receive filtered data and report it faithfully. Second, the unprivileged BPF restrictions (kernel.unprivileged_bpf_disabled) block unauthorized loading but do not protect against a process that has already obtained CAP_BPF through a preceding privilege escalation.
Kubernetes environments add complexity: Falco and Tetragon typically run as DaemonSets with CAP_SYS_ADMIN or CAP_BPF. A container escape that achieves code execution as root on the node gains the same capability level as the security tool and can load competing BPF programs.
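To see which processes on a node already hold the capabilities discussed above, one option is to decode the CapEff mask from /proc/<pid>/status. The following is a minimal sketch, not part of any tool named here: the function names has_cap and bpf_caps are hypothetical, and the bit positions (21 for CAP_SYS_ADMIN, 38 for CAP_PERFMON, 39 for CAP_BPF) are taken from the kernel's capability header.

```shell
# has_cap HEXMASK BIT: succeed if capability bit BIT is set in HEXMASK
# (HEXMASK is the CapEff value from /proc/<pid>/status, without a 0x prefix)
has_cap() (
  mask=$1
  bit=$2
  [ $(( (0x$mask >> bit) & 1 )) -eq 1 ]
)

# bpf_caps HEXMASK: print the BPF-relevant capabilities present in the mask,
# or an empty line if none are present
bpf_caps() (
  out=""
  has_cap "$1" 21 && out="$out CAP_SYS_ADMIN"
  has_cap "$1" 38 && out="$out CAP_PERFMON"
  has_cap "$1" 39 && out="$out CAP_BPF"
  echo "${out# }"
)

# Example use on a live host (hypothetical PID 1234):
#   bpf_caps "$(awk '/^CapEff:/ { print $2 }' /proc/1234/status)"
```

Any process that reports CAP_SYS_ADMIN, or CAP_BPF together with CAP_PERFMON, is in a position to load tracing programs and belongs on the list of agents you expect to see loading BPF.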
Target systems: Linux 5.8–6.12 with eBPF-based security monitoring (Falco ≥0.32, Tetragon ≥0.10, Cilium ≥1.12, custom BPF agents); Kubernetes nodes with DaemonSet-based observability; any host where CAP_BPF or CAP_SYS_ADMIN can be obtained by an attacker.
Threat Model
Adversary 1 — Container escape with host root. Access level: root on the Kubernetes node after a container escape. Objective: load eBPF rootkit programs to blind Falco and Tetragon before performing lateral movement, preventing any alert from firing during the attack.
Adversary 2 — Compromised privileged DaemonSet. Access level: code execution inside a DaemonSet pod that runs with CAP_BPF (e.g., a compromised custom monitoring agent). Objective: use the pod’s existing BPF capability to load rootkit programs that hide the attacker’s subsequent activity.
Adversary 3 — Kernel LPE followed by BPF rootkit. Access level: initial unprivileged shell, then kernel privilege escalation (e.g., via n_gsm or io_uring UAF). Objective: load BPF rootkit after achieving root to evade post-compromise detection.
Adversary 4 — Supply chain compromise of BPF agent. Access level: control over the container image or binary for an eBPF monitoring tool. Objective: replace the monitoring agent with a version that selectively suppresses alerts while appearing to function normally.
Without hardening: eBPF rootkit loads undetected; security tools report clean telemetry; attacker moves freely. With hardening: BPF program loading triggers alerts; BPF map count anomaly is detected; kernel lockdown restricts BPF loading; secondary observability layer provides independent view.
Configuration / Implementation
Step 1 — Restrict unprivileged BPF loading
# /etc/sysctl.d/90-bpf-hardening.conf
# Block BPF program loading by unprivileged users (once set to 1, it stays set until reboot)
kernel.unprivileged_bpf_disabled = 1
# Harden the BPF JIT for all users (constant blinding; mitigates JIT spraying)
net.core.bpf_jit_harden = 2
# Do not export JIT-compiled BPF program symbols via /proc/kallsyms
net.core.bpf_jit_kallsyms = 0
# Hide kernel pointers from all users (limits info leaks for exploit chains)
kernel.kptr_restrict = 2

# Apply the settings (shell command, not part of the file above)
sysctl --system
# Verify
sysctl kernel.unprivileged_bpf_disabled
# kernel.unprivileged_bpf_disabled = 1
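kernel.unprivileged_bpf_disabled = 1 does not prevent root from loading BPF programs. Unprivileged users can only load a few program types (such as socket filters), not the tracing programs a rootkit needs, but the unprivileged bpf() interface has repeatedly served as a privilege-escalation vector through verifier bugs. Disabling it removes that route to the capabilities a rootkit requires.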
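A small compliance check can confirm that the four settings from this step are actually in effect on a node. This is a sketch under stated assumptions: the function name check_bpf_sysctls is hypothetical, the expected values are the ones used in this section, and the input is the "key = value" format that the sysctl command prints.

```shell
# check_bpf_sysctls: read "key = value" lines on stdin and print every key
# whose value differs from the hardened value used in this section.
# Prints nothing when all four keys are present and compliant.
check_bpf_sysctls() {
  awk -F' *= *' '
    BEGIN {
      want["kernel.unprivileged_bpf_disabled"] = 1
      want["net.core.bpf_jit_harden"] = 2
      want["net.core.bpf_jit_kallsyms"] = 0
      want["kernel.kptr_restrict"] = 2
    }
    ($1 in want) {
      seen[$1] = 1
      if ($2 + 0 != want[$1])
        printf "%s is %s, expected %s\n", $1, $2, want[$1]
    }
    END {
      # Report keys that never appeared in the input
      for (k in want)
        if (!(k in seen)) printf "%s missing from input\n", k
    }'
}

# Example use on a live host (run as root; some of these keys are root-readable only):
#   sysctl kernel.unprivileged_bpf_disabled net.core.bpf_jit_harden \
#          net.core.bpf_jit_kallsyms kernel.kptr_restrict | check_bpf_sysctls
```

The check treats kernel.unprivileged_bpf_disabled = 2 as non-compliant on purpose: that value also disables unprivileged BPF, but unlike 1 it can be switched back at runtime.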
Step 2 — Enable kernel lockdown to restrict BPF from unsigned code
# Check current lockdown mode (the active mode is shown in brackets)
cat /sys/kernel/security/lockdown
# Enable lockdown=integrity at boot (blocks bpf_probe_write_user and other kernel-modification paths)
# Append to the existing kernel command line in /etc/default/grub:
GRUB_CMDLINE_LINUX="lockdown=integrity"
# Regenerate the GRUB configuration (Debian/Ubuntu; other distributions use grub2-mkconfig)
update-grub
# Or on systems using systemd-boot:
# Add lockdown=integrity to /boot/loader/entries/linux.conf options line
lockdown=integrity prevents bpf_probe_write_user calls (used by TripleCross for privilege escalation), blocks /dev/mem access, and prevents unsigned kernel module loading. Reading /proc/kcore is blocked only at the confidentiality level. Integrity mode does not prevent legitimate BPF programs from being loaded by root; it blocks the specific primitive that BPF rootkits use to rewrite the memory of user-space processes, along with the classic routes for modifying kernel memory.
Note: lockdown=confidentiality is more restrictive (it additionally blocks kprobes, BPF reads of kernel memory, perf, and tracefs) and is likely to break kprobe-based monitoring tools. Start with integrity.
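The lockdown file lists all modes and marks the active one with square brackets, which is easy to misread in scripts. The sketch below extracts the active mode from that text; the function names active_lockdown_mode and lockdown_at_least_integrity are hypothetical helpers, not part of any existing tool.

```shell
# active_lockdown_mode TEXT: print the bracketed (active) mode from the
# contents of /sys/kernel/security/lockdown,
# e.g. "none [integrity] confidentiality" yields "integrity"
active_lockdown_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# lockdown_at_least_integrity TEXT: succeed if the active mode is
# integrity or confidentiality
lockdown_at_least_integrity() {
  case "$(active_lockdown_mode "$1")" in
    integrity|confidentiality) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use on a live host:
#   lockdown_at_least_integrity "$(cat /sys/kernel/security/lockdown)" \
#     || echo "ALERT: kernel lockdown is not enforced"
```

Running this from the same job that performs the other node checks catches nodes that booted without the lockdown parameter, for example after a bootloader configuration was regenerated.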
Step 3 — Monitor BPF program loading and map creation
Use Tetragon or Falco to alert on every new BPF program load, whichever process performs it:
Tetragon TracingPolicy:
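apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: monitor-bpf-loading
spec:
  kprobes:
    # Assumption: security_bpf_prog_load exists on your kernel. Older kernels
    # name the equivalent hook security_bpf_prog_alloc; check /proc/kallsyms.
    - call: "security_bpf_prog_load"
      syscall: false
      args:
        - index: 0
          type: "int"
      selectors:
        # Start with Post here: with only PID 1 exempted, Sigkill also hits your own agents
        - matchPIDs:
            - operator: NotIn
              values:
                # Exempt known legitimate BPF agents (add your PIDs or use matchNamespaces)
                - 1
          matchActions:
            - action: Sigkill # Kill unexpected BPF loader; adjust to Post for audit-only
        - matchActions:
            - action: Post # Log all BPF program loads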
Falco rule:
- rule: BPF Program Loaded
  desc: A BPF program was loaded; alert if from an unexpected process
  # cmd 5 is BPF_PROG_LOAD; without this filter the rule fires on every bpf() call
  condition: >
    evt.type = bpf and
    evt.dir = > and
    (evt.arg.cmd = 5 or evt.arg.cmd = BPF_PROG_LOAD) and
    not proc.name in (falco, tetragon, cilium-agent, bpftool, node_exporter) and
    not container.name in (falco, tetragon-agent)
  output: >
    BPF program loaded by unexpected process
    (proc=%proc.name pid=%proc.pid user=%user.name
    container=%container.name image=%container.image.repository)
  priority: CRITICAL
  tags: [ebpf, rootkit, kernel]
Step 4 — Implement BPF program inventory baseline
Run a periodic check of loaded BPF programs and alert on unexpected additions:
#!/bin/bash
# /usr/local/bin/bpf-inventory-check.sh
# Run this as a cron job or via your monitoring stack
set -euo pipefail

BASELINE_FILE=/var/lib/bpf-inventory/baseline.json
# Use an unpredictable temp file and always clean it up on exit
CURRENT_FILE=$(mktemp /tmp/bpf-current.XXXXXX)
trap 'rm -f "$CURRENT_FILE"' EXIT

# Capture current BPF program list
bpftool prog list -j > "$CURRENT_FILE"

if [[ ! -f "$BASELINE_FILE" ]]; then
  mkdir -p /var/lib/bpf-inventory
  cp "$CURRENT_FILE" "$BASELINE_FILE"
  echo "Baseline established with $(jq length "$BASELINE_FILE") programs"
  exit 0
fi

# Compare: look for programs whose ID is not in the baseline
NEW_PROGRAMS=$(jq -r --slurpfile baseline "$BASELINE_FILE" '
  ($baseline[0] | map(.id)) as $baseline_ids |
  .[] | select(.id as $id | $baseline_ids | index($id) == null) |
  "\(.id) \(.type) \(.name // "unnamed") loaded_at=\(.loaded_at // "unknown")"
' "$CURRENT_FILE")

if [[ -n "$NEW_PROGRAMS" ]]; then
  echo "ALERT: New BPF programs detected since baseline:"
  echo "$NEW_PROGRAMS"
  # Send to your alerting system
  # curl -X POST $ALERT_WEBHOOK -d "{\"text\": \"New BPF programs: $NEW_PROGRAMS\"}"
else
  echo "No new BPF programs since baseline"
fi

# Install the script and schedule it (run these once, as root)
chmod +x /usr/local/bin/bpf-inventory-check.sh
# Run every 5 minutes
echo "*/5 * * * * root /usr/local/bin/bpf-inventory-check.sh >> /var/log/bpf-inventory.log 2>&1" \
  > /etc/cron.d/bpf-inventory
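Program IDs change every time an agent restarts, so an ID-based baseline alerts on known-good programs. One possible refinement is to compare programs by type, name, and tag (the hash of the program's instructions) instead. The following is a sketch that parses the plain-text output of bpftool prog list; the function names bpf_fingerprints and new_fingerprints are hypothetical, and the parsing assumes the usual "ID: type name NAME tag TAG" header line.

```shell
# bpf_fingerprints: read plain `bpftool prog list` output on stdin and print
# one sorted "type name tag" line per program, ignoring the volatile ID.
# Programs without a name or tag get "-" in that position.
bpf_fingerprints() {
  awk '
    /^[0-9]+: / {
      name = "-"; tag = "-"
      for (i = 3; i < NF; i++) {
        if ($i == "name") name = $(i + 1)
        if ($i == "tag")  tag  = $(i + 1)
      }
      print $2, name, tag
    }' | LC_ALL=C sort -u
}

# new_fingerprints BASELINE CURRENT: print fingerprints present in the file
# CURRENT but absent from the file BASELINE (both written by bpf_fingerprints)
new_fingerprints() {
  LC_ALL=C comm -13 "$1" "$2"
}

# Example use on a live host (file paths are placeholders):
#   bpftool prog list | bpf_fingerprints > /var/lib/bpf-inventory/current.txt
#   new_fingerprints /var/lib/bpf-inventory/baseline.txt /var/lib/bpf-inventory/current.txt
```

A restart of an unchanged agent produces the same fingerprints, while an agent upgrade or a new program produces new ones. Like the ID-based check, this relies on bpftool output, so a rootkit that hooks bpf() to hide its programs defeats both.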
Step 5 — Detect BPF hook hiding via cross-layer comparison
A BPF rootkit that hides processes from getdents64 will show a discrepancy between the kernel’s view and a non-BPF view. Implement cross-layer process enumeration:
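#!/bin/bash
# /usr/local/bin/process-consistency-check.sh
# Compare processes listed in /proc with those observed making syscalls via bpftrace

# Observe PIDs through the raw_syscalls tracepoint for 5 seconds.
# This path does not depend on getdents64, so a hook on /proc listing does not filter it.
# bpftrace has no timeout flag, so the interval probe ends the run.
BPFTRACE_PIDS=$(bpftrace -e '
  tracepoint:raw_syscalls:sys_enter { @seen[pid] = 1; }
  interval:s:5 { exit(); }
' 2>/dev/null | sed -n 's/^@seen\[\([0-9]*\)\].*/\1/p' | sort -u)

# Enumerate PIDs via /proc after sampling (this listing may be filtered by a BPF hook)
PROC_PIDS=$(ls /proc | grep -E '^[0-9]+$' | sort -u)

# PIDs seen by bpftrace but missing from the /proc listing.
# comm needs lexically sorted input, hence plain sort rather than sort -n.
HIDDEN=""
for pid in $(comm -23 <(echo "$BPFTRACE_PIDS") <(echo "$PROC_PIDS")); do
  # Skip processes that simply exited; kill -0 does not use getdents64
  kill -0 "$pid" 2>/dev/null && HIDDEN="$HIDDEN $pid"
done

if [[ -n "$HIDDEN" ]]; then
  echo "ALERT: PIDs active but hidden from /proc:$HIDDEN"
fi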
Note: a sophisticated rootkit can also hook bpftrace. True independence requires hardware-based monitoring (Intel PT, PMU) or a hypervisor-level view. This script detects naive rootkits that only hook getdents64.
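A second comparison that needs no BPF at all is to probe /proc/<pid> by direct path lookup, which does not go through getdents64, and compare the result with the directory listing. The sketch below illustrates that idea; the function name hidden_pids is hypothetical. It checks the Tgid field because /proc/<tid> resolves for every thread even though threads never appear in the listing.

```shell
# hidden_pids LISTED_FILE PROC_ROOT MAX_PID
# Print every PID in 1..MAX_PID whose PROC_ROOT/<pid>/status is readable by
# direct path lookup, is a thread-group leader (Tgid equals the PID), and is
# missing from LISTED_FILE (one PID per line, as obtained by listing PROC_ROOT).
hidden_pids() (
  listed=$1
  root=$2
  max=$3
  pid=1
  while [ "$pid" -le "$max" ]; do
    if [ -r "$root/$pid/status" ]; then
      tgid=$(awk '/^Tgid:/ { print $2 }' "$root/$pid/status" 2>/dev/null)
      if [ "$tgid" = "$pid" ] && ! grep -qx "$pid" "$listed"; then
        echo "$pid"
      fi
    fi
    pid=$((pid + 1))
  done
)

# Example use on a live host (the listing file path is a placeholder):
#   ls /proc | grep -E '^[0-9]+$' > /run/proc-listed.txt
#   hidden_pids /run/proc-listed.txt /proc "$(cat /proc/sys/kernel/pid_max)"
```

Scanning the full PID range in shell is slow on hosts with a large pid_max, and a process that starts between the listing and the scan shows up as a false positive, so any hit should be rechecked before alerting. A rootkit that also hooks path lookups or reads of /proc/<pid>/status evades this check as well.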
Step 6 — Deploy a secondary observability layer outside BPF
The most reliable defence is a monitoring layer that does not share the same kernel hook points as the primary BPF security tool:
Option A — auditd (kernel audit subsystem, separate from BPF):
# /etc/audit/rules.d/90-bpf-monitor.rules
# Monitor bpf() syscall
-a always,exit -F arch=b64 -S bpf -F key=bpf_syscall
# Monitor module loading (adjacent to BPF loading)
-a always,exit -F arch=b64 -S finit_module -S init_module -F key=module_load
# Monitor ptrace (used by some rootkit install paths)
-a always,exit -F arch=b64 -S ptrace -F key=ptrace_call

# Load the rules (shell commands, not part of the rules file above)
augenrules --load
# On distributions where systemctl restart is refused for auditd, use: service auditd restart
systemctl restart auditd
# Verify BPF syscall is being audited
ausearch -k bpf_syscall --start today | head -20
auditd operates via a separate kernel hook (audit_log_* functions) that is harder to blind with a BPF program than the standard kprobe/tracepoint paths used by Falco/Tetragon.
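The bpf_syscall key records every bpf() call, including routine map reads by legitimate agents, so it helps to narrow the records to program loads. The sketch below summarizes raw audit SYSCALL records by command name; the function name bpf_prog_loads is hypothetical, and it assumes the first syscall argument appears as a0=5 (BPF_PROG_LOAD, printed in hexadecimal) and that the key matches the rule in this step.

```shell
# bpf_prog_loads: read raw audit SYSCALL records on stdin and print a
# "count comm" summary of bpf() calls whose first argument is 5 (BPF_PROG_LOAD),
# most frequent caller first.
bpf_prog_loads() {
  awk '
    /type=SYSCALL/ && /key="bpf_syscall"/ && / a0=5 / {
      comm = "unknown"
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^comm=/) {
          comm = $i
          sub(/^comm=/, "", comm)   # drop the field name
          gsub(/"/, "", comm)       # drop the surrounding quotes
        }
      }
      count[comm]++
    }
    END { for (c in count) print count[c], c }' | sort -rn
}

# Example use on a live host:
#   ausearch -k bpf_syscall --start today --raw | bpf_prog_loads
```

Any command name in the summary that is not one of your known agents deserves a closer look. Command names containing spaces are hex-encoded by the audit subsystem and appear in that encoded form here.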
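Option B — signed kernel module hooking security_bpf_prog_load:
For organizations that can deploy custom kernel modules with module signing, a loadable kernel module that hooks security_bpf_prog_load provides a monitoring path that a BPF-only rootkit cannot intercept. Mainline kernels do not let a loadable module register LSM hooks after boot, so in practice such a module has to attach by another mechanism (for example, a kprobe or ftrace hook registered from module code), or the hook has to be built into the kernel as an LSM.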
Step 7 — Enforce Seccomp to block bpf() in workload containers
For application containers that have no legitimate need to load BPF programs, block the bpf() syscall via Seccomp:
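# For all production workloads: add to pod spec
securityContext:
  seccompProfile:
    type: RuntimeDefault
The RuntimeDefault Seccomp profile in containerd and Docker denies bpf() to containers that lack CAP_SYS_ADMIN (newer runtime versions also allow it with CAP_BPF), which covers ordinary application pods. Verify:
# RuntimeDefault is built into the container runtime, so there is no profile file to inspect for it.
# Confirm a filter is active inside a running pod (Seccomp: 2 means filter mode):
kubectl exec <pod> -- grep Seccomp /proc/1/status
# For Localhost profiles stored on the node, check how bpf is handled:
jq '.syscalls[] | select(any(.names[]; test("^bpf$")))' \
  /var/lib/kubelet/seccomp/profiles/audit.json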
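To audit many pods at once, the Seccomp field of /proc/<pid>/status can be decoded from a saved copy of the status file. This is a minimal sketch; the function name seccomp_mode is hypothetical, and the mapping of 0, 1, and 2 to disabled, strict, and filter follows the kernel's documented values for that field.

```shell
# seccomp_mode STATUS_FILE: print the seccomp mode recorded in a copy of
# /proc/<pid>/status (disabled, strict, filter, or unknown)
seccomp_mode() {
  case "$(awk '/^Seccomp:/ { print $2 }' "$1")" in
    0) echo disabled ;;
    1) echo strict ;;
    2) echo filter ;;
    *) echo unknown ;;
  esac
}

# Example use (pod name and output path are placeholders):
#   kubectl exec my-pod -- cat /proc/1/status > /tmp/my-pod.status
#   seccomp_mode /tmp/my-pod.status
```

A result of filter shows that some Seccomp profile is applied. It does not by itself prove that bpf() is denied, so pair it with an actual load attempt from inside a test pod.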
Expected Behaviour
| Signal | Before hardening | After hardening |
|---|---|---|
| bpftool prog list shows unexpected programs | Not alerted | BPF inventory check fires alert within 5 minutes |
| kernel.unprivileged_bpf_disabled | 0 (upstream default; some distributions ship 1 or 2) | 1 |
| lockdown mode | none | integrity |
| New BPF program load from non-system process | No alert from Falco/Tetragon | Falco CRITICAL alert fires |
| bpf() syscall in application container | Permitted | Blocked by RuntimeDefault Seccomp |
| auditd logs bpf() calls | Not configured | All bpf() syscalls logged with uid, pid, comm |
Verification:
# Confirm BPF inventory baseline is established
ls -la /var/lib/bpf-inventory/baseline.json
# Confirm auditd rule is active
auditctl -l | grep bpf_syscall
# -a always,exit -F arch=b64 -S bpf -F key=bpf_syscall
# Confirm lockdown mode (the active mode is the one in brackets)
cat /sys/kernel/security/lockdown
# none [integrity] confidentiality
# Attempt BPF load as unprivileged user (/tmp/test.bpf must be a compiled BPF object)
su -s /bin/bash nobody -c "bpftool prog load /tmp/test.bpf /sys/fs/bpf/test 2>&1"
# Expected: the load fails with "Operation not permitted" (exact wording varies by bpftool version)
Trade-offs
| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| lockdown=integrity | Blocks BPF memory-write primitives used by rootkits | Breaks tools that depend on unsigned kernel modules, debugfs, /dev/mem, or bpf_probe_write_user; kprobes and perf are restricted only at the confidentiality level | Test on a non-production node first; lockdown=integrity is less restrictive than confidentiality |
| BPF inventory check | Detects new programs added post-baseline | Generates false positives on every kernel update or agent restart | Re-baseline after planned updates; exclude known-good program names |
| auditd BPF rule | Independent from BPF-based monitoring | High log volume on BPF-heavy systems | Rate-limit by uid; exempt known monitoring service accounts |
| Seccomp blocking bpf() in containers | Prevents container workloads from loading BPF | Breaks workloads that legitimately use BPF (rare, but e.g. some network tools) | Allowlist specific pods that need BPF via explicit Seccomp profile override |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Falco/Tetragon self-alerts on own BPF programs | Flood of false-positive alerts on startup | Alert volume spike at agent start | Add exemption for the monitoring agent’s own pod/namespace in Falco/Tetragon rules |
| lockdown=integrity breaks legacy monitoring tool | Tool fails to start; dmesg shows lockdown rejection | dmesg \| grep -i "lockdown" shows blocked operation | Update the tool; if not possible, run on a dedicated node without lockdown |
| BPF inventory baseline becomes stale after kernel upgrade | Every new BPF program fires an alert; alert fatigue | All alerts after upgrade reference known-good programs | Re-baseline post-upgrade as part of maintenance runbook |
| Sophisticated rootkit hooks both kprobes and auditd paths | No alert from either monitoring layer | Gap detected only via hypervisor-level or hardware tracing | Layer in VM introspection (e.g., KVM VMI); schedule periodic offline forensic analysis |
Related Articles
- eBPF LSM — building LSM hooks that enforce security policy alongside eBPF monitoring programs
- eBPF Tetragon — deploying Tetragon for kernel-level process and network tracing
- Falco Security Rules — writing and tuning Falco rules for runtime threat detection
- Linux LPE Defence in Depth — the privilege escalation paths that give an attacker the CAP_BPF needed to load rootkits
- File Integrity Monitoring — complementary monitoring layer that detects rootkit installation artifacts