Limiting NGINX Worker Process Blast Radius with OS-Level Controls

Problem

NGINX’s process model is deliberately simple: a master process runs as root to bind privileged ports (80, 443) and manage configuration; worker processes handle all request processing and run as an unprivileged user (nginx, www-data, or a custom account). The design intent is that even if a worker process is compromised, the attacker inherits only the unprivileged worker’s context — not root.

Recent NGINX CVEs show why this design deserves scrutiny rather than trust. CVE-2024-7347 (a buffer over-read in ngx_http_mp4_module), the HTTP/3 QUIC module vulnerabilities CVE-2024-24989 and CVE-2024-24990, and earlier memory corruption bugs all sit in code that runs in the worker process. When a bug of this class is exploitable for code execution, the attacker’s code runs as the NGINX worker user. On a typical deployment, that user can:

Read all files accessible to the worker — including SSL private keys, application configuration, and files that are world-readable on the host
Make outbound network connections to arbitrary destinations (no egress restriction by default)
Read /proc entries for other processes (though limited by ptrace restrictions)
Write to any directory writable by the worker user
Call almost any syscall — there is no Seccomp filter on NGINX workers by default
On systems with lax filesystem permissions, pivot to application secrets or credentials
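
One way to gauge the first and last items on this list is to look for secret-looking files that any local account, including the worker user, can read. The sketch below is a hypothetical helper, not part of NGINX; the file-name patterns are assumptions and should be adapted to the host being audited.

```shell
# Hypothetical helper: list secret-looking files under a directory that are
# readable by "other", and therefore by the NGINX worker user.
# The name patterns are assumptions; extend them for your environment.
find_world_readable_secrets() {
    find "$1" -xdev -type f \
        \( -name '*.key' -o -name '*.pem' -o -name '.env' -o -name 'id_rsa' \) \
        -perm -0004 2>/dev/null
}

# Usage: find_world_readable_secrets /etc
```

Checking mode bits rather than running the search as the worker user keeps the result independent of who runs it; it will miss files the worker can read through group membership.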

The gap between “runs as unprivileged user” and “fully contained” is significant. OS-level controls that the NGINX process model does not provide by default include: syscall filtering (Seccomp), network namespace isolation, filesystem namespace restrictions, and capability bounding sets.

These controls matter most in the window between CVE disclosure and patch deployment. If your emergency patching SLA is 7 days for critical vulnerabilities, these controls are your defence for those 7 days.

Target systems: any Linux host running NGINX as a public-facing web server or reverse proxy on bare metal or a VM, with NGINX managed by systemd. Every control below is applied through systemd directives, so hosts that still start NGINX from init scripts need a different mechanism. This article focuses on non-containerised NGINX; containerised deployments have different tooling.

Threat Model

Adversary 1: RCE via memory corruption CVE. A vulnerability in an NGINX module (mp4, QUIC, image_filter) allows an attacker to achieve code execution in a worker process. With only baseline hardening: the attacker can read key material held in worker memory and any file the worker user can read, make outbound connections, and attempt further privilege escalation. With OS-level controls: the attacker is restricted to a narrow syscall allowlist and cannot reach most of the filesystem; outbound network access still needs a separate egress restriction.

Adversary 2: SSRF via proxy_pass misconfiguration. A misconfigured proxy_pass directive allows the NGINX worker to make requests to internal services on behalf of an attacker. Without network isolation: the worker can reach internal services on any network interface. With network isolation or address filtering: the worker reaches only the destinations explicitly permitted.

Adversary 3: Post-exploitation privilege escalation. After achieving worker-level code execution, an attacker attempts to escalate to root via a kernel LPE. Without Seccomp: every syscall an LPE chain might need is reachable. With Seccomp: the restricted syscall set removes many of the syscalls that common LPE chains depend on.
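
For Adversary 2, one hedged option under systemd is address filtering rather than a full network namespace: the `IPAddressDeny=` and `IPAddressAllow=` directives, where an allow match takes precedence over a deny match. The helper below is hypothetical and only prints a drop-in; the addresses in the usage line are placeholders. The filter applies to traffic in both directions and depends on kernel and cgroup support, so any load balancer or client range that sits inside a denied range must also be allowed.

```shell
# Hypothetical helper: print a systemd drop-in that denies traffic to the given
# CIDR ranges, except for explicitly allowed upstream addresses.
# IPAddressDeny=/IPAddressAllow= are systemd directives; the helper is not.
nginx_ip_filter_dropin() {
    deny=$1    # space-separated ranges to block, e.g. RFC 1918 and link-local
    allow=$2   # space-separated addresses to keep reachable (may be empty)
    printf '[Service]\n'
    printf 'IPAddressDeny=%s\n' "$deny"
    if [ -n "$allow" ]; then
        printf 'IPAddressAllow=%s\n' "$allow"
    fi
}

# Usage (example addresses are placeholders):
# nginx_ip_filter_dropin "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16" "10.0.5.20" \
#     > /etc/systemd/system/nginx.service.d/ip-filter.conf
```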

Configuration / Implementation

Step 1 — Baseline worker user configuration

Before adding OS-level controls, ensure the worker runs with minimal permissions:

# /etc/nginx/nginx.conf

# Dedicated worker user with no login shell and no home directory
user nginx nginx;

# Worker process count — one per CPU core
worker_processes auto;

# Limit worker connections
events {
    worker_connections 1024;
    use epoll;
}

# Create dedicated user if it doesn't exist
useradd --system --no-create-home --shell /bin/false --user-group nginx

# Verify group membership: the worker should not be in sudo, wheel, or ssl-cert
id nginx
# uid=xxx(nginx) gid=xxx(nginx) groups=xxx(nginx)

# Ensure SSL private keys are NOT readable by the worker user
ls -la /etc/ssl/private/nginx.key
# Should be: -rw-r----- root ssl-cert (not readable by nginx)
# The master reads the key as root before forking workers, so workers never need file access.
# The loaded key still lives in worker memory: this protects the file, not the key material.

Step 2 — Apply a Seccomp filter via systemd

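The most impactful control is restricting which syscalls NGINX can make. systemd installs the filter before it executes the nginx binary, so it applies to the master and to every worker. With the filter below, a compromised worker cannot call ptrace or process_vm_readv to scan other processes’ memory, cannot load kernel modules, and loses many kernel LPE primitives. execve itself stays allowed, because systemd needs it to start NGINX; anything the attacker executes inherits the same filter and sandbox: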

# /etc/systemd/system/nginx.service.d/seccomp-hardening.conf
[Service]
# Apply systemd's built-in Seccomp filtering

# Allowlist: only syscalls in these groups are permitted
SystemCallFilter=@system-service @network-io @file-system @io-event @signal @timer
# Remove dangerous groups from the allowlist
SystemCallFilter=~@privileged @obsolete @reboot @swap @cpu-emulation @debug

# Specifically deny syscalls commonly used in kernel LPE exploits
SystemCallFilter=~ptrace process_vm_readv process_vm_writev userfaultfd

# Deny module-related syscalls
SystemCallFilter=~finit_module init_module delete_module

# Re-allow what the master needs to switch workers to the nginx user and to own
# its directories: the setuid/setgroups and chown calls were removed with @privileged above.
# For a more restrictive setup, consider separate service units for master and workers
SystemCallFilter=@setuid @chown
SystemCallArchitectures=native

# Additional hardening
NoNewPrivileges=yes
ProtectSystem=strict
# Keep this identical to Step 3: drop-ins load in file-name order and the last value wins
ProtectHome=yes
PrivateTmp=yes
PrivateDevices=yes
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
RestrictNamespaces=yes
RestrictRealtime=yes
LockPersonality=yes
MemoryDenyWriteExecute=yes

# Allow NGINX to write logs, cache, and runtime state; a leading "-" ignores a missing path
ReadWritePaths=/var/log/nginx /var/cache/nginx -/run/nginx
# Already read-only under ProtectSystem=strict; listed to document what NGINX reads
ReadOnlyPaths=/etc/nginx -/usr/share/nginx -/var/www

systemctl daemon-reload
systemctl restart nginx

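# Verify the filter is configured on the unit
systemctl show nginx -p SystemCallFilter

# Verify the running master process has Seccomp active
grep Seccomp: /proc/$(systemctl show nginx -p MainPID --value)/status
# Should show: Seccomp: 2

# Test that nginx still works
curl -I http://localhost/
# Expected: HTTP/1.1 200 OK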

Step 3 — Restrict filesystem access

# /etc/systemd/system/nginx.service.d/filesystem-hardening.conf
[Service]

# Prevent NGINX workers from accessing home directories
ProtectHome=yes

# Read-only system except for writable paths
ProtectSystem=strict

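# Explicit writable paths only. /run is needed for the pid file (commonly /run/nginx.pid).
# /tmp needs no entry: PrivateTmp=yes already gives the service a private, writable /tmp
ReadWritePaths=/var/log/nginx /var/cache/nginx /run

# Prevent access to sensitive directories
InaccessiblePaths=/root /home /boot /proc/1

# Keep these paths visible and read-only; on its own this does not hide the rest of /etc
BindReadOnlyPaths=/etc/nginx /etc/ssl/certs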

Step 4 — Apply capability bounding set

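Bound the capabilities available to the service. The master keeps only what it needs as root; workers lose all capabilities when they switch to the unprivileged user: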

# /etc/systemd/system/nginx.service.d/capabilities.conf
[Service]

# The master process needs NET_BIND_SERVICE to bind port 80/443, SETUID/SETGID to
# switch workers to the nginx user, and DAC_OVERRIDE to open files it does not own
# Workers drop all capabilities when they switch from root to the nginx user
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SETUID CAP_SETGID CAP_DAC_OVERRIDE

# Ambient capabilities: none needed after startup
AmbientCapabilities=

# Prevent gaining privileges through execve (setuid binaries, file capabilities)
NoNewPrivileges=yes
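
To confirm the bounding set took effect, the hex mask in `/proc/<pid>/status` can be compared with the mask expected for the capabilities listed above. The helper below is a hypothetical sketch; the bit numbers are the ones defined in linux/capability.h, and only the capabilities used in this article are mapped.

```shell
# Hypothetical helper: compute the hex mask that /proc/<pid>/status shows for
# a set of capabilities. Only the capabilities used in this article are mapped.
cap_mask() {
    mask=0
    for cap in "$@"; do
        case $cap in
            CAP_CHOWN)            bit=0 ;;
            CAP_DAC_OVERRIDE)     bit=1 ;;
            CAP_SETGID)           bit=6 ;;
            CAP_SETUID)           bit=7 ;;
            CAP_NET_BIND_SERVICE) bit=10 ;;
            *) echo "unmapped capability: $cap" >&2; return 1 ;;
        esac
        mask=$(( mask | (1 << bit) ))
    done
    printf '%016x\n' "$mask"
}

# Usage: compare the computed mask with the master's CapBnd line
# cap_mask CAP_NET_BIND_SERVICE CAP_SETUID CAP_SETGID CAP_DAC_OVERRIDE
# grep CapBnd /proc/$(systemctl show nginx -p MainPID --value)/status
```

For the four capabilities in the drop-in above the mask is 00000000000004c2. A worker that has switched to the unprivileged user should additionally show an all-zero CapEff.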

Step 5 — Write a targeted Seccomp BPF profile for NGINX

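For higher-security deployments, replace systemd’s group-based filter with an NGINX-specific allowlist of individual syscalls. The profile below uses the JSON layout of Docker/OCI seccomp profiles. Treat it as a starting point: the filter covers the master as well as the workers, and the exact set depends on your NGINX build, modules, and libc, so trace a staging instance before enforcing it: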

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
  "syscalls": [
    {
      "names": [
        "accept4", "bind", "brk", "close", "clone", "connect", "epoll_create1",
        "epoll_ctl", "epoll_wait", "epoll_pwait", "eventfd2", "fcntl", "fstat",
        "futex", "getdents64", "getpid", "getuid", "geteuid", "getgid",
        "getegid", "ioctl", "listen", "lseek", "mmap", "mprotect", "munmap",
        "nanosleep", "open", "openat", "pipe2", "poll", "ppoll", "pread64",
        "pwrite64", "read", "readv", "recvfrom", "recvmsg", "rename",
        "rt_sigaction", "rt_sigprocmask", "rt_sigreturn", "rt_sigsuspend",
        "sendfile", "sendmsg", "sendto", "set_robust_list", "setsockopt",
        "getsockopt", "set_tid_address", "shutdown", "socket", "socketpair",
        "stat", "newfstatat", "write", "writev", "exit", "exit_group",
        "clock_gettime", "gettimeofday", "getrlimit", "setrlimit", "prlimit64",
        "prctl", "sched_getaffinity", "sched_yield", "setgid", "setgroups",
        "setuid", "unlink", "mkdir", "chmod", "chown", "utime", "utimensat",
        "wait4"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

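Save as /etc/nginx/nginx-worker.seccomp.json and keep it under version control as the reviewed allowlist. systemd cannot load this file directly: it has no SeccompFilter= directive, and SystemCallFilter= takes syscall names, not a path. To enforce the allowlist, use a drop-in that sorts after seccomp-hardening.conf, reset the group-based filter, and list the names from the profile. Repeated SystemCallFilter= lines are merged, so continue with further lines until every name is covered; execve and a few other calls are always allowed by systemd:

[Service]
# An empty assignment resets the filter from Step 2; the lines after it form the new allowlist
SystemCallFilter=
SystemCallFilter=accept4 bind close connect epoll_create1 epoll_ctl epoll_wait eventfd2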
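
Because `SystemCallFilter=` takes syscall names rather than a JSON file, the names in the profile can be translated into drop-in lines mechanically. The function below is a hypothetical sketch that assumes a single "names" array laid out as in this article; for arbitrary JSON a real parser such as jq is the safer choice.

```shell
# Hypothetical helper: turn the "names" array of a simple seccomp JSON profile
# into SystemCallFilter= lines, eight names per line.
# Assumes one "names" array whose closing bracket is on its own line.
seccomp_json_to_systemd() {
    # The empty assignment resets any filter set by earlier drop-ins
    printf '[Service]\nSystemCallFilter=\n'
    sed -n '/"names"/,/]/p' "$1" \
        | grep -o '"[a-z0-9_]*"' \
        | tr -d '"' \
        | grep -v '^names$' \
        | xargs -n 8 echo \
        | sed 's/^/SystemCallFilter=/'
}

# Usage: seccomp_json_to_systemd /etc/nginx/nginx-worker.seccomp.json
```

Review the generated lines before installing them, and remember that the resulting filter applies to the master process as well as the workers.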

Step 6 — Verify worker process isolation

# Pick one worker process to inspect
WORKER_PID=$(pgrep -f "nginx: worker" | head -1)
echo "Worker PID: $WORKER_PID"

# Verify worker runs as the expected user with no capabilities
grep -E "^(Name|Uid|Gid|CapPrm|CapEff)" /proc/$WORKER_PID/status

# Check the worker's open files: they should not include sensitive paths
ls -la /proc/$WORKER_PID/fd/ | grep -v "pipe\|socket\|nginx"

# Confirm Seccomp is active on the worker
grep Seccomp: /proc/$WORKER_PID/status
# Expected: Seccomp: 2 (filter active)

# Show the syscall the worker is currently blocked in (run as root)
# This is a smoke test, not a filter test: an idle worker normally sits in epoll_wait (232 on x86_64)
cat /proc/$WORKER_PID/syscall

Expected Behaviour

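Scenario	Before hardening	After hardening
Worker calls `execve` to spawn shell	Succeeds	Still permitted by the systemd filter (`execve` is always allowed), but the child inherits the Seccomp filter, the restricted filesystem view, and `NoNewPrivileges`
Worker reads `/root/.ssh/`	Succeeds if world-readable	Blocked by `ProtectHome=yes` and `InaccessiblePaths`
Worker makes outbound connection	Unrestricted	Still unrestricted by the settings above; Seccomp cannot filter on destination address or port, so egress needs a separate control (firewall rule or `IPAddressDeny=`)
`/proc/$worker/status` shows Seccomp	`Seccomp: 0` (disabled)	`Seccomp: 2` (filter active)
Worker attempts `ptrace` on another process	Succeeds against same-UID processes, subject to Yama `ptrace_scope`	Blocked by Seccomp; the worker is killed with `SIGSYS`
Privilege gain via setuid binary or file capabilities	Possible	Prevented by `NoNewPrivileges=yes`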
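
The rows that are visible in `/proc` can be turned into a pass/fail check, for example in a deployment pipeline. The helper below is a hypothetical sketch: it reads the text of a status file on standard input and fails unless a Seccomp filter is active and `NoNewPrivs` is set.

```shell
# Hypothetical helper: read the text of /proc/<pid>/status on stdin and fail
# unless a seccomp filter is active (Seccomp: 2) and NoNewPrivs is 1.
check_worker_status() {
    awk '
        /^Seccomp:/    { seccomp = $2 }
        /^NoNewPrivs:/ { nnp = $2 }
        END {
            printf "Seccomp=%s NoNewPrivs=%s\n", seccomp, nnp
            exit !(seccomp == 2 && nnp == 1)
        }
    '
}

# Usage: check_worker_status < /proc/$WORKER_PID/status
```

Reading from standard input keeps the function easy to test with saved output and avoids assumptions about how the worker PID is found.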

Trade-offs

Aspect	Benefit	Cost	Mitigation
Strict Seccomp filter	Blocks most LPE exploit chains	NGINX modules that need unusual syscalls will break	Audit each module’s syscall requirements; add exceptions with comments explaining why
`ProtectSystem=strict`	Prevents worker from writing to system paths	NGINX module configuration may write to unexpected paths	Map all legitimate write paths; add them to `ReadWritePaths`
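`MemoryDenyWriteExecute`	Blocks memory mappings that are both writable and executable, so injected shellcode cannot run; it does not stop ROP, which reuses existing code	JIT compilers break, for example PCRE JIT (`pcre_jit on`) and LuaJIT	Disable only if a specific module requires it; document the exception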
Separate seccomp for master vs. worker	Tighter worker restrictions	Complex to implement with systemd’s single-service model	Use `Type=forking` with custom startup wrapper if needed

Failure Modes

Failure	Symptom	Detection	Recovery
Seccomp blocks legitimate NGINX syscall	NGINX fails to start or a worker dies mid-request; systemd shows SIGSYS	`dmesg` shows `audit: type=1326` (Seccomp violation); NGINX error log shows unexpected exit	Read the `syscall=` number from the audit record, map it to a name (for example with `ausyscall`), and add it to the allowlist
`ProtectHome` breaks serving files from home dirs	403 Forbidden for files under `/home/`	NGINX error log shows permission denied	Move served files out of home directories; use `/var/www`
`MemoryDenyWriteExecute` breaks Lua/njs module	Module fails to load; NGINX exits	NGINX error log shows memory mapping error	Add `MemoryDenyWriteExecute=no` and document why
`InaccessiblePaths` hides path NGINX legitimately needs	NGINX cannot find config file or cert	NGINX fails to start under systemd even though `nginx -t` passes in an unsandboxed shell; `journalctl -u nginx` shows the failing path	Move the resource to a non-protected path or remove from `InaccessiblePaths`
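
For the first failure mode, the blocked syscall can be read straight from the kernel log. The helper below is a hypothetical sketch that extracts the distinct `syscall=` numbers from Seccomp audit records (`type=1326`); it assumes the records reach the kernel log, which may not be the case when auditd is running and writing them to its own log file.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the distinct
# syscall numbers recorded in seccomp audit records (type=1326).
seccomp_denied_syscalls() {
    grep 'type=1326' \
        | grep -o 'syscall=[0-9]*' \
        | cut -d= -f2 \
        | sort -n -u
}

# Usage: journalctl -k --no-pager | seccomp_denied_syscalls
```

The numbers are architecture-specific: on x86_64, for instance, 101 is ptrace. Map each number to a name before deciding whether it belongs in the allowlist.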

Related Articles

Linux LPE Defence in Depth: the layered OS controls that contain exploitation even without a patch
Seccomp BPF Without Containers: applying Seccomp at the service level to non-containerised processes like NGINX
Systemd Unit Hardening: the full set of systemd security directives used in this article
NGINX Hardening Beyond TLS: application-layer NGINX hardening that complements OS-level controls
NGINX Fleet Patch Management: managing NGINX patches across the fleet while OS-level controls provide interim protection