Real-Time CVE Exposure Assessment with eBPF: Is This Kernel Bug Actually Reachable?

The Problem

When a kernel CVE drops — CVSSv3 9.8, remote code execution, affects kernels 5.15 through 6.10 — the first question defenders ask is “are we exposed?” The traditional answer takes days: wait for your scanner vendor to update signatures, wait for the OS vendor to publish an advisory with mitigations, wait for your asset inventory to sync. In the meantime, LLM-assisted exploit development can produce a working proof-of-concept within hours of the CVE description being public.

The question “are we exposed?” actually has two sub-questions that defenders routinely conflate:

  1. Is the vulnerable code present? (Kernel version in range, module compiled in, config option enabled.)
  2. Is the vulnerable code reachable? (Is any process actually calling the vulnerable path on this specific host right now?)

Standard vulnerability scanners answer question 1. They read /proc/version, compare against a CVE’s affected-range list, and report “vulnerable.” But many kernel CVEs are only exploitable through specific subsystems — io_uring, SCTP, NFC, USB gadget drivers — that are never called on a given production host despite being present. A host that is “vulnerable” by scanner criteria may have zero actual call-path exposure.
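
The presence check behind question 1 can be sketched in a few lines of shell. This is a minimal illustration, not how any particular scanner works: version_in_range is a hypothetical helper, it assumes GNU sort with the -V option, and it ignores distribution backports, which can fix a bug without changing the upstream version number.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if kernel version $3 lies in [$1, $2] inclusive.
# Bounds are exact versions, so pass the last affected point release as $2.
# Local suffixes such as "-generic" are stripped before comparing.
version_in_range() {
  local low=$1 high=$2 ver=${3%%-*}
  # ver >= low when low sorts first; ver <= high when high sorts last.
  [ "$(printf '%s\n' "$low" "$ver" | sort -V | head -n1)" = "$low" ] &&
    [ "$(printf '%s\n' "$ver" "$high" | sort -V | tail -n1)" = "$high" ]
}

# Usage (the range is an assumption taken from the example CVE above):
# if version_in_range 5.15 6.10 "$(uname -r)"; then echo "in range"; fi
```

An "in range" result says only that the code may be present; it says nothing about whether any workload calls it.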

eBPF lets you answer question 2 in real time without rebooting or patching. You attach a kprobe or tracepoint to the vulnerable function, collect call stacks and process names for 60 seconds, and learn whether any workload on the host is exercising the code path. If nothing calls it, you can deprioritise that host in the patch queue and focus on hosts where the path is live.

This approach is especially valuable when:

  • CVE patch lag means you cannot patch the same day
  • Hosts have different workload profiles (some run USB gadget stacks, some don’t)
  • You need to demonstrate to an auditor which hosts are priority-1 vs priority-3

Target systems: Linux 5.8+ with CONFIG_BPF_SYSCALL=y and CONFIG_KPROBES=y; hosts with bpftool, bpftrace, or Python BCC installed; production hosts where patching within 24 hours is not feasible for all systems.

Threat Model

1. Attacker with network access exploiting a recently-published CVE (external). Objective: exploit a kernel vulnerability before patches are deployed across the fleet. Impact: kernel-level code execution; full host compromise. Likelihood increases with every hour the patch is not deployed.

2. Insider attacker with unprivileged shell access (authenticated low-privilege user). Objective: trigger a local privilege escalation via a CVE in a subsystem that the host exposes to unprivileged users (io_uring, namespace operations, perf_event). Impact: full root compromise.

3. Container escape via vulnerable kernel path (attacker with code execution in a container). Objective: exploit a CVE in the kernel’s namespace or seccomp handling from within a container. Impact: host compromise, lateral movement to other containers.

The exposure-assessment approach does not reduce the vulnerability itself, but it enables accurate prioritisation: identify which hosts have live call paths into the vulnerable code, and ensure those hosts get patches in the first wave rather than the second.

Hardening Configuration

Step 1: Identify the Vulnerable Function from the CVE Description

Most kernel CVEs identify the vulnerable function directly in the commit message or advisory. For CVEs that don’t, search the git log:

# Search full commit messages (not just subject lines) for the CVE identifier.
# Many upstream commits omit the CVE ID, so an empty result is common.
git -C /usr/src/linux log --all --oneline --grep="CVE-2026-XXXXX"

# Or list recent commits touching the file named in the advisory
git -C /usr/src/linux log --oneline --all -- \
  "net/ipv4/tcp_input.c" | head -20

# Hunk headers in the fix commit show which functions were changed
git -C /usr/src/linux show <fix-commit-sha> | grep '^@@'

For public CVEs, the NVD page often links the fix commit. The vulnerable function name is usually visible in the hunk headers of the diff.
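
The function names in a fix commit can be pulled out of the diff mechanically. The sketch below assumes a hypothetical helper, hunk_functions, that reads a unified diff on stdin and prints the enclosing function named in each hunk header. Treat the output as candidates only: git reports the nearest preceding function-like line, which is not always the function that was changed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list function names from unified-diff hunk headers on stdin.
hunk_functions() {
  # Step 1: keep the context text after the second "@@" of each hunk header.
  # Step 2: take the identifier immediately before the first "(".
  sed -n 's/^@@ [^@]*@@ \(.*\)$/\1/p' |
    sed -nE 's/^([^(]*[^A-Za-z0-9_(])?([A-Za-z_][A-Za-z0-9_]*)\(.*$/\2/p' |
    sort -u
}

# Usage:
# git -C /usr/src/linux show <fix-commit-sha> | hunk_functions
```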

Step 2: Attach a kprobe with bpftrace

# Attach a kprobe to the vulnerable function and log callers
# Example: CVE in tcp_ack_update_rtt()

bpftrace -e '
kprobe:tcp_ack_update_rtt {
  @calls[comm, pid] = count();
}

interval:s:60 {
  printf("=== CVE exposure summary (60s window) ===\n");
  print(@calls);
  clear(@calls);
  exit();
}
'

Output interpretation:

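  • Empty @calls after 60 seconds: no workload called the function during the window. This lowers the host's priority, but it does not prove the path is unreachable; an attacker may still be able to trigger it.
  • Non-empty: inspect the process names (comm) to understand which workloads are triggering it. For functions that run in softirq context, as much of TCP receive processing does, comm and pid identify whichever task was interrupted, not necessarily the process that owns the socket.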
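
When many PIDs share a process name, it can help to total the counts per name before reading them. A minimal sketch, assuming bpftrace prints map entries in the form "@calls[comm, pid]: count"; calls_by_comm is a hypothetical helper, not part of bpftrace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum call counts per process name from bpftrace map output on stdin.
calls_by_comm() {
  # Turn "@calls[nginx, 1234]: 57" into "57 nginx"; other lines are dropped.
  sed -nE 's/^@calls\[(.*), ([0-9]+)\]: ([0-9]+)$/\3 \1/p' |
    awk '{ n = $1; sub(/^[0-9]+ /, ""); total[$0] += n }
         END { for (c in total) print total[c], c }' |
    sort -rn
}

# Usage: save the bpftrace output to a file, then
# calls_by_comm < bpftrace-output.txt
```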

For more detailed call-stack attribution:

bpftrace -e '
kprobe:tcp_ack_update_rtt {
  @[comm, ustack()] = count();
}

interval:s:30 {
  print(@);
  clear(@);
  exit();
}
'

Step 3: Structured Assessment Script

Wrap the assessment into a reusable script that produces machine-readable output for fleet-wide use:

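#!/bin/bash
# cve-exposure-check.sh
# Usage: ./cve-exposure-check.sh <kernel_function> <cve_id> [duration_seconds]
# Prints exactly one JSON object on stdout; progress and errors go to stderr.

FUNC=${1:?Usage: $0 <function> <cve_id> [duration]}
CVE=${2:?Usage: $0 <function> <cve_id> [duration]}
DURATION=${3:-60}
HOST=$(hostname -f)
KERNEL=$(uname -r)

# Emit a one-line JSON result for cases where no probe data was collected
report() {
  printf '{"hostname":"%s","cve":"%s","function":"%s","kernel":"%s","status":"%s","exposed":%s,"callers":"none"}\n' \
    "$HOST" "$CVE" "$FUNC" "$KERNEL" "$1" "$2"
}

echo "Checking CVE ${CVE} exposure via ${FUNC} on ${HOST} (kernel ${KERNEL})" >&2

# Verify the function exists in the running kernel.
# /proc/kallsyms lines are "<address> <type> <name> [module]", so compare field 3.
if ! awk -v f="$FUNC" '$3 == f { found = 1; exit } END { exit !found }' /proc/kallsyms; then
  report not_present false
  exit 0
fi

# clear() before exit() stops bpftrace printing the map a second time on exit
if ! RESULT=$(bpftrace -e "
kprobe:${FUNC} {
  @calls[comm, pid] = count();
}
interval:s:${DURATION} {
  print(@calls);
  clear(@calls);
  exit();
}
"); then
  # Load or attach failure: the host was not assessed, so do not call it unexposed
  report probe_failed null
  exit 1
fi

# Count unique (comm, pid) callers
CALLER_COUNT=$(echo "$RESULT" | grep -c "^@calls\[" || true)

if [ "$CALLER_COUNT" -eq 0 ]; then
  EXPOSED=false
  CALLERS="none"
else
  EXPOSED=true
  # Rewrite "@calls[nginx, 1234]: 57" as "nginx:1234" so commas only separate callers
  CALLERS=$(echo "$RESULT" | sed -nE 's/^@calls\[(.*), ([0-9]+)\]: .*/\1:\2/p' | paste -sd, -)
fi

cat <<EOF
{
  "hostname": "${HOST}",
  "cve": "${CVE}",
  "function": "${FUNC}",
  "kernel": "${KERNEL}",
  "status": "assessed",
  "assessment_duration_s": ${DURATION},
  "exposed": ${EXPOSED},
  "callers": "${CALLERS}"
}
EOF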
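
Because the script interpolates its first argument into a bpftrace program that runs as root, it may be worth rejecting anything that does not look like a kernel symbol before it is used. A minimal sketch, assuming a hypothetical valid_symbol helper that is not part of the script above:

```shell
#!/usr/bin/env bash
# Hypothetical guard: accept only names that look like kernel symbols.
valid_symbol() {
  case $1 in
    # Reject empty names, a bad first character, or any character outside
    # letters, digits, underscore, and dot (dots appear in suffixes like .isra.0).
    '' | [!A-Za-z_]* | *[!A-Za-z0-9_.]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Possible use near the top of cve-exposure-check.sh:
# valid_symbol "$FUNC" || { echo "invalid function name: $FUNC" >&2; exit 2; }
```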

Step 4: Fleet-Wide Assessment via Ansible

# playbook-cve-exposure.yml
---
- name: Assess CVE exposure across fleet
  hosts: all
  become: true
  vars:
    cve_id: "CVE-2026-XXXXX"
    vulnerable_function: "tcp_ack_update_rtt"
    assessment_duration: 60

  tasks:
    - name: Copy assessment script
      copy:
        src: cve-exposure-check.sh
        dest: /tmp/cve-exposure-check.sh
        mode: "0755"

    - name: Run exposure assessment
      command: >
        /tmp/cve-exposure-check.sh
        {{ vulnerable_function }}
        {{ cve_id }}
        {{ assessment_duration }}
      register: exposure_result
      changed_when: false

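    - name: Collect results
      set_fact:
        # Parse from the first "{" in case the script prints a status line before the JSON
        cve_exposure: "{{ exposure_result.stdout[exposure_result.stdout.find('{'):] | from_json }}"

    - name: Save each host's result on the control node
      copy:
        content: "{{ cve_exposure | to_json }}\n"
        dest: "/tmp/exposure-results/{{ inventory_hostname }}.json"
      delegate_to: localhost
      become: false

    - name: Flag exposed hosts
      debug:
        msg: "EXPOSED: {{ inventory_hostname }} - callers: {{ cve_exposure.callers }}"
      when: cve_exposure.exposed | bool

Run and produce a prioritised patch list:

# ansible-playbook has no option for writing task results to a file, so the
# playbook saves one JSON file per host into this directory on the control node
mkdir -p /tmp/exposure-results
ansible-playbook playbook-cve-exposure.yml \
  -e cve_id=CVE-2026-XXXXX \
  -e vulnerable_function=tcp_ack_update_rtt

# Extract exposed hosts, sorted by number of callers
jq -s 'map(select(.exposed)) | sort_by(.callers | split(",") | length) | reverse |
  .[] | {hostname, callers}' \
  /tmp/exposure-results/*.json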

Step 5: Continuous Monitoring Until Patched

For the duration between CVE publication and patch deployment, run the assessment continuously and alert if a previously-unexposed host starts calling the vulnerable function:

# Prometheus pushgateway integration
# FUNC and CVE must be set in the environment before starting the loop.
# Each cycle takes about 90 seconds: a 30-second probe window plus a 60-second sleep.
HOST=$(hostname -f)
while true; do
  RESULT=$(./cve-exposure-check.sh "${FUNC}" "${CVE}" 30)
  # Parse from the first line that opens the JSON object
  EXPOSED=$(echo "$RESULT" | sed -n '/^{/,$p' | jq -r '.exposed')

  # Push metric to Prometheus pushgateway
  cat <<EOF | curl -s --data-binary @- \
    "http://pushgateway:9091/metrics/job/cve_exposure/instance/${HOST}"
# HELP cve_exposed Whether this host has active call paths to vulnerable function
# TYPE cve_exposed gauge
cve_exposed{cve="${CVE}",function="${FUNC}",host="${HOST}"} $( [ "$EXPOSED" = "true" ] && echo 1 || echo 0 )
EOF
  sleep 60
done
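
Keeping the metric formatting in its own function makes it possible to check the output without a pushgateway. The sketch below assumes a hypothetical render_metric helper; the metric and label names match the loop above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one gauge sample in the Prometheus text exposition format.
# Arguments: cve_id, function name, host, and "true" or "false" for exposure.
render_metric() {
  local cve=$1 func=$2 host=$3 exposed=$4 value=0
  if [ "$exposed" = "true" ]; then value=1; fi
  printf '# TYPE cve_exposed gauge\n'
  printf 'cve_exposed{cve="%s",function="%s",host="%s"} %s\n' \
    "$cve" "$func" "$host" "$value"
}

# Usage (pushgateway URL as in the loop above):
# render_metric "$CVE" "$FUNC" "$HOST" "$EXPOSED" | curl -s --data-binary @- "$URL"
# The pushgateway keeps the last pushed value until the group is deleted:
# curl -s -X DELETE "$URL"
```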

Alert rule:

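# prometheus alert
- alert: NewCVEExposureDetected
  # Fire only on a 0 -> 1 transition: exposed now, unexposed 10 minutes ago.
  # The 10-minute lookback is an assumption; tune it to the polling interval.
  expr: cve_exposed == 1 and (cve_exposed offset 10m) == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: "Host {{ $labels.host }} newly exposed to {{ $labels.cve }}"
    action: "Escalate to patch priority-1; investigate callers"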

Handling Functions Not Exported in kallsyms

Some vulnerable functions are inlined by the compiler and don’t appear as symbols:

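# Check if the function is present. The symbol name is the third column of
# /proc/kallsyms; also match compiler-renamed variants such as ${FUNC}.isra.0
awk -v f="${FUNC}" '$3 == f || index($3, f ".") == 1' /proc/kallsyms

# If absent, probe the caller instead
# Find call sites in the source tree for the running kernel version
git -C /usr/src/linux grep -n "${FUNC}(" -- net/ipv4/

# Or list probeable functions with bpftrace to find a nearby non-inlined caller
bpftrace -l "kprobe:*tcp*" | grep -i "ack"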
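
Compilers often keep a function as a symbol but rename it with a suffix such as .isra.0, .constprop.0, or .part.0, in which case an exact-name match reports it as missing. A minimal sketch of a lookup that tolerates those suffixes, assuming a hypothetical symbol_variants helper and the kallsyms column layout of address, type, name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list code symbols equal to a function name or carrying a
# dotted compiler suffix. Reads a kallsyms-format file (default /proc/kallsyms).
symbol_variants() {
  local func=$1 file=${2:-/proc/kallsyms}
  # Types "t" and "T" are local and global text (code) symbols.
  awk -v f="$func" '
    ($2 == "t" || $2 == "T") && ($3 == f || index($3, f ".") == 1) { print $3 }
  ' "$file" | sort -u
}

# Usage:
# symbol_variants tcp_ack_update_rtt
```

Whether a tracer can attach to a suffixed symbol depends on the tool and its version, so an empty kprobe listing for a name found here still points to probing a caller.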

Expected Behaviour After Hardening

| Scenario | Without Assessment | With eBPF Assessment |
|---|---|---|
| CVE published at 09:00 | All “vulnerable” hosts queued for patches equally | Exposed hosts identified by 09:05; non-exposed hosts deprioritised |
| 40% of fleet exposed | Unknown; patch all 100% in first wave | Patch exposed 40% within 4 hours; remaining 60% in next maintenance window |
| Workload change makes previously-unexposed host exposed | No detection | Continuous monitor alert fires within one polling cycle (about 90 seconds with a 30-second window and a 60-second sleep) |
| Inlined vulnerable function | Not applicable | Script reports “not_present” instead of failing; caller probing used as fallback |
| Patch deployed | Not applicable | The patched function still exists and is still called, so the probe keeps firing; confirm the fixed kernel with uname -r, stop the monitoring loop, and delete the host's pushgateway group so the alert clears |

Trade-offs and Operational Considerations

| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| kprobe at vulnerable function | Low overhead per call, though it accumulates on hot paths | Requires root, or CAP_BPF plus CAP_PERFMON on Linux 5.8+; may not work with locked-down kernels | Run under a dedicated security monitoring account with only those capabilities where the tooling allows it; bpftrace itself generally expects to run as root |
| 60-second sampling window | Short enough to be actionable | Low-frequency callers may be missed | Run assessment during peak traffic; extend window for low-traffic periods |
| Fleet-wide Ansible playbook | Single-pane exposure view | Ansible inventory must be current; playbook adds SSH load | Run against static inventory snapshot; use parallel forks |
| Continuous pushgateway loop | Real-time exposure metric | Extra probe overhead on exposed hosts | Reduce polling to 5-minute interval after initial 1-hour intensive monitoring |

Failure Modes

| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| bpftrace not installed | bpftrace fails with “command not found”; a script that discards stderr and ignores the exit status reports the host as unexposed | Check for the tool with command -v before probing; check the script exit code | Pre-install bpftrace or BCC via Ansible before assessment run |
| Kernel built with CONFIG_KPROBES=n | kprobe attach fails; with errors discarded the host can be misreported as unexposed | Check CONFIG_KPROBES in /boot/config-$(uname -r) or, where available, /proc/config.gz | Fall back to a kallsyms-only presence check and treat the host as unassessed |
| Function inlined; kprobe unavailable | Assessment reports “not_present” although the code is compiled in | Symbol missing from /proc/kallsyms while the function exists in the source for the running kernel version | Probe the non-inlined caller; document limitation in results |
| Assessment script resource contention on overloaded host | eBPF program load fails | Error in bpftrace output | Retry after load drops; reduce other bpftrace programs running simultaneously |
| Attacker triggers vulnerability after assessment window | Host assessed as unexposed; exploitation occurs | Post-incident timeline vs assessment timestamps | Run continuous monitoring for high-CVSS CVEs until patched |
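
The first two failure modes can be caught before any probe is attached. The sketch below assumes a hypothetical preflight helper; the kernel config path is passed in so the same check works with /boot/config-$(uname -r) or an uncompressed copy of /proc/config.gz, and the tool name is a parameter only so the check can be exercised on machines without bpftrace.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print "ok", or the reason the assessment cannot run.
# Arguments: path to a kernel config file, and optionally the tracer binary name.
preflight() {
  local config=$1 tool=${2:-bpftrace}
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "tool_missing"
    return 1
  fi
  if ! grep -q '^CONFIG_KPROBES=y' "$config" 2>/dev/null; then
    echo "kprobes_disabled"
    return 1
  fi
  echo "ok"
}

# Usage:
# preflight "/boot/config-$(uname -r)" || exit 1
```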