Linux Kernel Crypto API Security: algif_aead Attack Surface and Safe Primitive Selection

The Problem

The Linux kernel ships a full cryptographic subsystem — symmetric ciphers, hash functions, RNGs, asymmetric operations, and AEAD constructions — used internally by IPsec, dm-crypt, WireGuard, TLS offloading, and the VFS layer. Since Linux 2.6.38, a subset of this subsystem has been exposed to userspace through AF_ALG sockets (address family 38). Applications can request hardware-accelerated AES-GCM or ChaCha20-Poly1305 from the kernel rather than implementing the cipher in userspace, letting them leverage AES-NI, SHA-NI, and vendor crypto accelerators without linking to OpenSSL.

The motivations are legitimate: key material stays in kernel address space rather than userspace heap, hardware acceleration is shared across applications, and FIPS 140-3 validated implementations in the kernel can be reused by userspace without recertifying each application. The implementation creates a kernel socket interface that accepts encryption parameters, processes arbitrary user-supplied data, and performs complex reference-counted resource management across concurrent send and receive operations. That combination has produced exploitable bugs repeatedly.

CVE-2021-3543 (assigned May 2021, CVSS 7.8 local) is a null pointer dereference in the algif_aead module. The bug lives in aead_sendmsg() in crypto/algif_aead.c. When a process calls sendmsg() on an AEAD operation socket after a prior recvmsg() call failed and left the socket’s scatter-gather list in a partially-allocated state, ctx->tsgl is null. The kernel then dereferences it unconditionally when computing the remaining data length. On kernels without SMAP, this was directly exploitable: a userspace allocation at virtual address 0 provided a controlled null page, turning the dereference into controlled kernel execution. On kernels with SMAP, the null dereference triggers an oops that can still produce a controlled crash usable in certain exploit chains. The bug affects Linux 5.11 and earlier unpatched kernels; stable backports landed in 5.11.17, 5.10.33, 5.4.115, and 4.19.190.

CVE-2019-8912 (assigned February 2019, CVSS 7.8 local) is a use-after-free in af_alg_release_parent() in crypto/af_alg.c. The AF_ALG socket model uses a two-socket design: a parent socket (sock) is bound to an algorithm name, then accept() on it produces a child operation socket (newsock) through which actual cryptographic operations are performed. The bug is a reference count race: af_alg_release_parent() decrements the parent’s reference count before checking whether the child socket is in an active operation, not after. If the parent socket is closed while a recvmsg() is in flight on the child — a timing window opened by slow crypto operations on large inputs — the parent socket structure is freed while the child’s in-flight operation still holds a pointer to it. Subsequent memory operations on the freed socket structure produce kernel heap corruption. This is directly exploitable for local privilege escalation. The patch, af_alg: fix race accessing crypto_alg, serialises the reference count check correctly; it was backported to 5.0.2, 4.20.15, 4.19.28, and 4.14.105.

Both bugs share a structural property: they arise from the intersection of complex reference counting, asynchronous crypto completion callbacks, and error path handling across a socket interface that was added incrementally over several kernel versions. The AF_ALG socket interface has the worst properties of both kernel networking code (complex state machines, reference counting across concurrent paths) and kernel crypto code (async completion, scatter-gather list management). Every async AEAD operation creates a request context that must be freed exactly once, by exactly the right code path, even when partial errors occur mid-operation.

Who can open AF_ALG sockets? This is where the attack surface is wider than most operators assume. Historically, socket(AF_ALG, SOCK_SEQPACKET, 0) required no capability at all: any unprivileged process could open one. Some distributions added CAP_NET_ADMIN as a prerequisite in patched kernels; others did not. In a container without a seccomp profile, a containerised process can open AF_ALG sockets against the host kernel regardless of the container’s UID, giving every unprivileged application process access to the full algif_aead attack surface. Kubernetes pods without a seccompProfile field in their securityContext run unconfined unless the kubelet is started with --seccomp-default, in which case they get the container runtime’s default profile, whose contents vary by CRI implementation and version.

The AEAD cryptographic context. AEAD (Authenticated Encryption with Associated Data) constructions — AES-GCM, ChaCha20-Poly1305, AES-CCM — provide both confidentiality and integrity in a single pass. The encryption produces a ciphertext and an authentication tag; decryption verifies the tag before returning plaintext. The security guarantee of AES-GCM collapses catastrophically when a (key, nonce) pair is reused: two ciphertexts encrypted under the same nonce allow recovery of the XOR of their plaintexts directly, and provide enough signal to forge authentication tags. This is not a theoretical concern. GCM nonce reuse is the most common AEAD misuse pattern in real code, and routing encryption through the kernel AF_ALG interface does not make the application immune to it — the kernel faithfully executes whatever nonce the application provides. The kernel does not enforce nonce uniqueness. It does not maintain a nonce counter. It encrypts whatever you send. Getting hardware acceleration via AF_ALG while reusing nonces is worse than using software AES-GCM correctly: you get the performance win and the cryptographic catastrophe simultaneously.
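The keystream-reuse half of this failure is easy to see in a few lines. The sketch below does not use AES-GCM. It substitutes a toy counter-mode keystream built from SHA-256, because GCM encrypts in counter mode and leaks in the same way when a (key, nonce) pair repeats. The names toy_keystream and toy_encrypt and all values are illustrative assumptions.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in CTR keystream: SHA-256(key || nonce || counter) blocks.
    Illustrative only; this is NOT AES-GCM and must not protect real data."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XORing the plaintext with the keystream."""
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"k" * 32
nonce = b"n" * 12  # reused for both messages: this is the bug
p1 = b"transfer 100 to alice"
p2 = b"transfer 999 to mallo"
c1 = toy_encrypt(key, nonce, p1)
c2 = toy_encrypt(key, nonce, p2)

# The keystream cancels out: an observer holding only c1 and c2
# learns p1 XOR p2 without ever seeing the key.
leak = xor_bytes(c1, c2)

# Knowing (or guessing) p1 then recovers p2 directly.
recovered_p2 = xor_bytes(leak, p1)
```

With real GCM the same reuse additionally exposes the authentication subkey, which is what makes tag forgery possible; that part is not modelled here.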

Threat Model

Local privilege escalation via algif_aead. An unprivileged local user (or a process in a container without seccomp) creates an AF_ALG socket, binds it to gcm(aes), and exploits a memory safety bug in algif_aead or af_alg through a crafted sequence of sendmsg/recvmsg calls with partial failures, concurrent closes, or malformed control messages. The null pointer dereference (CVE-2021-3543) and use-after-free (CVE-2019-8912) are both exploitable from this position. Outcome: kernel code execution, privilege escalation to root.

Container escape via kernel LPE. A Kubernetes pod without a seccomp profile, or with a seccomp profile that does not block socket(AF_ALG, ...), runs malicious or compromised code. The process opens AF_ALG sockets against the host kernel — not the container’s kernel, because Linux containers share the host kernel — and exploits algif_aead to obtain kernel code execution on the host. Outcome: full host compromise from an unprivileged containerised process.

AEAD nonce reuse in kernel-mediated encryption. An application uses AF_ALG to perform AES-GCM encryption for performance reasons but fails to manage nonces correctly — generating nonces from time(), using a counter that resets on process restart, or using a static nonce for a session key. The kernel encrypts each message with the same (key, nonce) pair. Outcome: full plaintext recovery for any two messages encrypted under the same nonce; authentication tag forgery; equivalent to no encryption at all for the affected messages.

Kernel module attack surface persistence. The algif_aead, algif_hash, and algif_skcipher modules are demand-loaded when an AF_ALG socket is opened with the corresponding algorithm type. On a system where these modules are not needed, they represent unnecessary kernel attack surface that is activated on demand by any unprivileged process. The modules remain loaded in kernel memory once loaded.

Hardening Configuration

1. Verify Patch Status

Confirm which CVEs apply to the running kernel. What matters is the vendor’s stable tree: an upstream fix in 5.12-rc8 is irrelevant if your system runs 5.10.x.

# Check running kernel version
uname -r

# CVE-2021-3543 fixed in: 5.12-rc8, 5.11.17, 5.10.33, 5.4.115, 4.19.190
# CVE-2019-8912 fixed in: 5.0.2, 4.20.15, 4.19.28, 4.14.105

# Ubuntu: check if the kernel security update has been applied
apt-cache policy linux-image-$(uname -r)
# The version string encodes the ABI — compare to Ubuntu's USN advisories.
# CVE-2021-3543 → USN-4946-1 (Ubuntu 20.04, kernel 5.4.0-73)
# CVE-2019-8912 → USN-3932-1 (Ubuntu 18.04, kernel 4.15.0-47)

# RHEL/CentOS: check the advisory
rpm -q kernel --qf '%{VERSION}-%{RELEASE}\n'
# CVE-2021-3543 → RHSA-2021:4356 (RHEL 8.5)

# Check whether algif_aead is currently loaded
lsmod | grep algif
# algif_aead             24576  0
# algif_skcipher         20480  0
# algif_hash             16384  0
# af_alg                 24576  3 algif_aead,algif_skcipher,algif_hash

# Verify which processes have open AF_ALG sockets.
# ss has no AF_ALG family, so use lsof, which reports these sockets
# with "protocol: ALG" in the NAME column:
lsof -nP 2>/dev/null | grep 'protocol: ALG'
# Clean system with no AF_ALG users prints nothing.

For kernels compiled with CONFIG_KALLSYMS, verify the patched functions are present. The CVE-2021-3543 fix adds aead_wait_for_wmem():

grep "aead_wait_for_wmem\|aead_sendmsg" /proc/kallsyms 2>/dev/null
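# Patched kernel shows both symbols; unpatched shows only aead_sendmsg.
# Symbol names change between kernel versions and static functions can be
# inlined, so treat a missing symbol as a hint and the vendor advisory as
# the authoritative answer.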
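The version comparison in this step can be sketched as a small helper. It assumes a vanilla stable kernel whose uname -r string carries the real upstream patch level; distribution kernels that backport fixes without bumping that number (for example 5.4.0-73) will be misjudged, so the vendor advisory still decides. The function names are hypothetical and the fixed-version lists are copied from the comments above.

```python
import re
from typing import Optional

# Fixed stable releases as listed in this section.
FIXED_CVE_2021_3543 = ["5.11.17", "5.10.33", "5.4.115", "4.19.190"]
FIXED_CVE_2019_8912 = ["5.0.2", "4.20.15", "4.19.28", "4.14.105"]


def parse_release(release: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a `uname -r` style string.
    A missing patch level is treated as 0."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3) or 0)


def has_upstream_fix(release: str, fixed_versions: list[str]) -> Optional[bool]:
    """True/False if the release belongs to a listed stable series (or is
    newer than all of them); None if the series is not listed at all."""
    major, minor, patch = parse_release(release)
    fixed = [parse_release(v) for v in fixed_versions]
    for f_major, f_minor, f_patch in fixed:
        if (major, minor) == (f_major, f_minor):
            return patch >= f_patch
    newest_series = max((f[0], f[1]) for f in fixed)
    if (major, minor) > newest_series:
        return True  # later mainline series already contain the fix
    return None      # older or unlisted series: check the vendor advisory


status = has_upstream_fix("5.10.32-custom", FIXED_CVE_2021_3543)
```

A None result is deliberate: for an end-of-life or unlisted series the helper refuses to guess.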

2. Seccomp: Block AF_ALG Socket Creation

This is the primary container-level control. Block socket() calls where the first argument is 38 (AF_ALG). The seccomp filter fires before any kernel crypto code executes, eliminating the attack surface entirely for containerised workloads.

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 22,
      "args": [
        {
          "index": 0,
          "value": 38,
          "op": "SCMP_CMP_EQ"
        }
      ]
    }
  ]
}

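errnoRet: 22 returns EINVAL to the caller. EACCES (13) and EPERM (1) are also common choices. EINVAL is what some older kernels returned for an unsupported address family; current kernels return EAFNOSUPPORT (97), so use that value instead if the goal is to make policy indistinguishable from a missing kernel feature. The examples that follow assume 22.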
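Before distributing a profile it can be worth checking mechanically that the AF_ALG rule is really present. The following is a minimal sketch with a hypothetical helper name, blocks_af_alg. It only understands allow-by-default profiles shaped like the one above and will not correctly judge allowlist-style profiles such as Docker's default.

```python
import json

AF_ALG = 38

# Actions that stop the syscall from reaching the kernel's socket code.
DENY_ACTIONS = {
    "SCMP_ACT_ERRNO",
    "SCMP_ACT_KILL",
    "SCMP_ACT_KILL_PROCESS",
    "SCMP_ACT_KILL_THREAD",
    "SCMP_ACT_TRAP",
}

PROFILE_JSON = """
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "errnoRet": 22,
      "args": [{"index": 0, "value": 38, "op": "SCMP_CMP_EQ"}]
    }
  ]
}
"""


def blocks_af_alg(profile: dict) -> bool:
    """Return True if some rule denies socket() when argument 0 == AF_ALG."""
    for rule in profile.get("syscalls", []):
        if "socket" not in rule.get("names", []):
            continue
        if rule.get("action") not in DENY_ACTIONS:
            continue
        for arg in rule.get("args") or []:
            if (
                arg.get("index") == 0
                and arg.get("op") == "SCMP_CMP_EQ"
                and arg.get("value") == AF_ALG
            ):
                return True
    return False


profile_ok = blocks_af_alg(json.loads(PROFILE_JSON))
```

A check like this fits a CI step that runs before the profile is copied to nodes.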

Apply to a Docker container:

# Save the profile to /etc/docker/seccomp/no-af-alg.json
docker run \
  --security-opt seccomp=/etc/docker/seccomp/no-af-alg.json \
  --rm -it ubuntu:24.04 \
  python3 -c "
import socket
try:
    s = socket.socket(38, socket.SOCK_SEQPACKET, 0)
    print('AF_ALG socket created — NOT blocked')
except OSError as e:
    print(f'Blocked (errno {e.errno}): {e.strerror}')
"
# Expected output: Blocked (errno 22): Invalid argument

Apply as a Kubernetes SeccompProfile. In Kubernetes 1.19+, pod-level seccomp profiles are GA:

apiVersion: v1
kind: Pod
metadata:
  name: example-workload
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: no-af-alg.json
  containers:
  - name: app
    image: example:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 65534

The localhostProfile path is relative to the kubelet’s seccomp profile directory, which defaults to /var/lib/kubelet/seccomp/. Copy no-af-alg.json to /var/lib/kubelet/seccomp/no-af-alg.json on each node.

For cluster-wide enforcement, apply the profile via a MutatingAdmissionWebhook or use a policy engine (Kyverno, OPA Gatekeeper) to require a seccompProfile on every pod. Without this, developers can simply omit the field and get the runtime default.
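The rule such a policy has to express is small. The sketch below is not Kyverno or Gatekeeper syntax; it is a plain-Python illustration, with hypothetical function names, of the check an admission policy would perform on a pod manifest that has already been parsed into a dict. It relies on the Kubernetes behaviour that a container-level seccompProfile overrides the pod-level one.

```python
from typing import Optional


def effective_seccomp_type(pod_spec: dict, container: dict) -> Optional[str]:
    """Return the seccomp profile type that applies to one container.
    The container-level setting wins over the pod-level setting."""
    contexts = (
        container.get("securityContext") or {},
        pod_spec.get("securityContext") or {},
    )
    for ctx in contexts:
        profile = ctx.get("seccompProfile")
        if profile and profile.get("type"):
            return profile["type"]
    return None


def containers_without_seccomp(pod: dict) -> list[str]:
    """Names of containers that would run with no profile or Unconfined."""
    spec = pod.get("spec") or {}
    containers = (spec.get("initContainers") or []) + (spec.get("containers") or [])
    return [
        c["name"]
        for c in containers
        if effective_seccomp_type(spec, c) in (None, "Unconfined")
    ]


hardened_pod = {
    "spec": {
        "securityContext": {
            "seccompProfile": {"type": "Localhost", "localhostProfile": "no-af-alg.json"}
        },
        "containers": [{"name": "app", "securityContext": {"runAsNonRoot": True}}],
    }
}
bare_pod = {"spec": {"containers": [{"name": "app"}]}}

violations = containers_without_seccomp(bare_pod)
```

Note that RuntimeDefault passes this check even though the runtime's default profile may still allow AF_ALG; a stricter policy would require type Localhost with the expected profile name.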

3. Kernel Module Blacklisting (Prevent algif_aead Loading)

On hosts where AF_ALG is not needed at all, which covers most application servers, prevent the modules from ever loading. This eliminates the attack surface at the kernel level rather than the syscall level: even a seccomp bypass cannot reach code that is not loaded.

# Prevent algif_aead, algif_hash, algif_skcipher from loading
cat > /etc/modprobe.d/crypto-restrict.conf << 'EOF'
# Block userspace AF_ALG crypto modules.
# These expose kernel crypto primitives via AF_ALG sockets (address family 38).
# CVE-2021-3543 (null ptr deref) and CVE-2019-8912 (use-after-free) both
# required these modules. Remove if any application explicitly uses AF_ALG.
# af_alg provides the socket family itself; without this line
# socket(AF_ALG, ...) still succeeds and only the later bind() fails.
install af_alg /bin/true
install algif_aead /bin/true
install algif_hash /bin/true
install algif_skcipher /bin/true
EOF

# Regenerate initramfs to apply at boot (Debian/Ubuntu)
update-initramfs -u -k all

# RHEL/Fedora
dracut --force

# Verify the rule is active on the running kernel (no reboot needed if the
# module is not yet loaded; the rule only prevents future loads).
# With "install ... /bin/true", modprobe runs /bin/true instead of loading
# the module, prints nothing, and exits 0:
modprobe algif_aead; echo "exit status: $?"
# exit status: 0

# Confirm that nothing was loaded:
lsmod | grep algif
# (no output: module not loaded)

# Verify that AF_ALG is unusable even without seccomp. socket() itself fails
# only when af_alg is blocked or absent; if only the algif_* modules are
# blocked, socket() succeeds and the later bind() fails instead.

python3 -c "
import socket
try:
    s = socket.socket(38, socket.SOCK_SEQPACKET, 0)
    print('AF_ALG socket created')
except OSError as e:
    print(f'Blocked: {e}')
"

Important limitation: install algif_aead /bin/true in modprobe.d intercepts every load that goes through modprobe, including the kernel’s own on-demand loading, but it does not stop insmod with an explicit path and does not unload a module that is already in memory. If the system has been running since before the rule was added, check lsmod and either reboot or rmmod algif_aead manually. rmmod refuses to remove a module that is in use, so a nonzero use count in lsmod means some process still holds an AF_ALG socket. The rule applies to all future loads immediately; modules that are already loaded stay until they are removed or the host reboots.
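Both conditions described above, a missing install rule and a module that is already resident, can be checked together. This is a minimal sketch assuming the modprobe.d file and /proc/modules have been read into strings; the function names are hypothetical, and the default module list is the three algif_* modules named in this section (extend it with af_alg if that module is blocked as well).

```python
def blocked_modules(modprobe_conf: str) -> set[str]:
    """Module names that have an `install <name> /bin/true` (or /bin/false)
    rule. modprobe treats '-' and '_' in names as equivalent."""
    blocked = set()
    for raw in modprobe_conf.splitlines():
        fields = raw.split("#", 1)[0].split()
        if (
            len(fields) >= 3
            and fields[0] == "install"
            and fields[2] in ("/bin/true", "/bin/false")
        ):
            blocked.add(fields[1].replace("-", "_"))
    return blocked


def loaded_modules(proc_modules: str) -> set[str]:
    """The first column of /proc/modules is the module name."""
    return {line.split()[0] for line in proc_modules.splitlines() if line.strip()}


def audit_block(
    modprobe_conf: str,
    proc_modules: str,
    required: tuple[str, ...] = ("algif_aead", "algif_hash", "algif_skcipher"),
) -> dict[str, list[str]]:
    """Report which required modules lack a rule and which are still loaded."""
    blocked = blocked_modules(modprobe_conf)
    loaded = loaded_modules(proc_modules)
    return {
        "missing_rule": sorted(m for m in required if m not in blocked),
        "still_loaded": sorted(m for m in required if m in loaded),
    }


# Illustrative inputs; on a real host read the two files instead.
sample_conf = "# partial config\ninstall algif_aead /bin/true\ninstall algif_hash /bin/true\n"
sample_proc = "algif_hash 16384 0 - Live 0x0000000000000000\nloop 40960 0 - Live 0x0000000000000000\n"
report = audit_block(sample_conf, sample_proc)
```

An empty list under both keys means the rules are complete and nothing needs a reboot or rmmod.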

4. Audit AF_ALG Usage Before Blocking

Blocking AF_ALG breaks any application that explicitly uses it. Identify these before applying controls. Most applications do not use AF_ALG — OpenSSL uses it only when compiled with enable-afalgeng and that engine is explicitly loaded. But verify.

# Active AF_ALG sockets on the running system.
# ss cannot list them (it has no AF_ALG family); lsof reports them
# with "protocol: ALG" in the NAME column:
lsof -nP 2>/dev/null | grep 'protocol: ALG'

# Without lsof: /proc/<pid>/fd links show only "socket:[inode]", so read the
# system.sockprotoname xattr instead, which is "ALG" for AF_ALG sockets.
# (requires root; getfattr is part of the attr package)
for fd in /proc/[0-9]*/fd/*; do
  proto=$(getfattr --only-values -n system.sockprotoname "$fd" 2>/dev/null | tr -d '\0')
  if [ "$proto" = "ALG" ]; then
    pid="${fd#/proc/}"; pid="${pid%%/*}"
    echo "PID ${pid}: $(tr '\0' ' ' < "/proc/${pid}/cmdline")"
  fi
done

# Check if OpenSSL has the AF_ALG engine available:
openssl engine
# 'afalg' in the output means the engine is available; it is only used if an
# application or openssl.cnf loads it. Absent from output = OpenSSL is using
# its built-in implementations (which include AES-NI).
openssl engine -vvvv afalg 2>/dev/null && echo "afalg engine available"

# Enable auditd logging of AF_ALG socket creation (address family 38 = 0x26):
auditctl -a always,exit -F arch=b64 -S socket \
  -F a0=38 -k af_alg_socket_create

# After the monitoring window (24–48 hours), review the matches.
# Raw records show a0=26 because audit logs syscall arguments in hexadecimal.
ausearch -k af_alg_socket_create --start this-week -i | \
  grep "exe="

The auditctl rule logs every socket(AF_ALG, ...) call with the calling process’s pid, uid, and executable path. Run this for 24–48 hours before deploying the module blacklist. Any process that appears in the audit log needs assessment before the block goes in.
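Reviewing the audit output by eye gets tedious after 48 hours of traffic. A short summariser helps; the sketch below counts AF_ALG socket creations per executable. The function name and the sample records are illustrative assumptions, abbreviated to the fields the parser reads, so real records will carry more fields.

```python
import re
from collections import Counter

EXE_RE = re.compile(r'\bexe="?([^"\s]+)')


def af_alg_callers(audit_text: str, key: str = "af_alg_socket_create") -> Counter:
    """Count SYSCALL records tagged with `key`, grouped by executable path.
    Accepts both raw records (quoted values) and `ausearch -i` output."""
    key_re = re.compile(rf'\bkey="?{re.escape(key)}\b')
    counts: Counter = Counter()
    for line in audit_text.splitlines():
        if "type=SYSCALL" not in line or not key_re.search(line):
            continue
        match = EXE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative, shortened records; not captured from a real system.
SAMPLE = """\
type=SYSCALL msg=audit(1700000000.101:1): arch=c000003e syscall=41 success=yes exit=3 a0=26 a1=5 a2=0 pid=1234 uid=1000 comm="python3" exe="/usr/bin/python3.12" key="af_alg_socket_create"
type=SYSCALL msg=audit(1700000001.202:2): arch=c000003e syscall=41 success=yes exit=4 a0=26 a1=5 a2=0 pid=2222 uid=0 comm="cryptsetup" exe="/usr/sbin/cryptsetup" key="af_alg_socket_create"
type=SYSCALL msg=audit(1700000002.303:3): arch=c000003e syscall=41 success=yes exit=3 a0=26 a1=5 a2=0 pid=1240 uid=1000 comm="python3" exe="/usr/bin/python3.12" key="af_alg_socket_create"
type=SYSCALL msg=audit(1700000003.404:4): arch=c000003e syscall=41 success=yes exit=3 a0=2 a1=1 a2=0 pid=999 uid=1000 comm="curl" exe="/usr/bin/curl" key="other_rule"
"""

callers = af_alg_callers(SAMPLE)
for exe, count in callers.most_common():
    print(f"{count:5d}  {exe}")
```

Every executable in the result is a candidate for breakage once the block goes in.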

5. Safe AEAD Usage via AF_ALG (When You Must Use It)

If your application must use AF_ALG — typically because it requires hardware acceleration on a system where OpenSSL’s software fallback is too slow, or because the kernel’s validated FIPS implementation is a compliance requirement — the following covers the correct socket API usage and the critical nonce management constraint.

import os
import socket

# AF_ALG constants. Python exposes these on Linux builds (3.6+), so the
# values from <linux/if_alg.h> do not need to be hard-coded:
#   socket.AF_ALG                = 38
#   socket.SOL_ALG               = 279  setsockopt level for ALG sockets
#   socket.ALG_SET_KEY           = 1    setsockopt optname: set cipher key
#   socket.ALG_SET_IV            = 2    cmsg type: IV/nonce for this operation
#   socket.ALG_SET_OP            = 3    cmsg type: ALG_OP_ENCRYPT or ALG_OP_DECRYPT
#   socket.ALG_SET_AEAD_ASSOCLEN = 4    cmsg type: length of the AAD prefix
#   socket.ALG_SET_AEAD_AUTHSIZE = 5    setsockopt optname: authentication tag size
#   socket.ALG_OP_DECRYPT        = 0
#   socket.ALG_OP_ENCRYPT        = 1


def aes_gcm_encrypt_af_alg(
    key: bytes, plaintext: bytes, aad: bytes
) -> tuple[bytes, bytes, bytes]:
    """
    Encrypt with AES-256-GCM via AF_ALG.
    Returns (nonce, ciphertext, tag).

    Nonce management: generates a fresh random 96-bit nonce per call.
    NEVER pass a nonce argument from outside; callers cannot be trusted
    to maintain uniqueness across process restarts, replicas, or key reuse.
    """
    if len(key) != 32:
        raise ValueError("AES-256 requires a 32-byte key")

    TAG_SIZE = 16    # full-length GCM authentication tag (128 bits)
    NONCE_SIZE = 12  # GCM nonce: 96 bits is the standard; other sizes require GHASH derivation

    # Cryptographically random nonce. NIST SP 800-38D caps random 96-bit
    # nonces at 2^32 messages per key; rotate the key before that.
    nonce = os.urandom(NONCE_SIZE)

    # Parent socket: bound to algorithm, not used for data transfer
    parent = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    try:
        # bind() takes (type, name[, feat[, mask]]) with str values.
        # type: "aead" for AEAD constructions (vs "skcipher", "hash", "rng")
        # name: kernel crypto algorithm name; must match an entry in /proc/crypto
        parent.bind(("aead", "gcm(aes)"))

        # Set the key. Must be done before accept().
        parent.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)

        # Set authentication tag size in bytes. The kernel reads the size
        # from optlen with a NULL optval, hence the (None, TAG_SIZE) form.
        # Always use 16: truncated tags are only acceptable in
        # resource-constrained protocols with strict message count limits.
        parent.setsockopt(
            socket.SOL_ALG, socket.ALG_SET_AEAD_AUTHSIZE, None, TAG_SIZE
        )

        # accept() creates the operation socket; actual data goes here
        op_sock, _ = parent.accept()
        try:
            # AAD and plaintext are sent concatenated, AAD first. The kernel
            # splits them using the ALG_SET_AEAD_ASSOCLEN control message.
            # sendmsg_afalg() builds the three control messages (ALG_SET_OP,
            # ALG_SET_IV as struct af_alg_iv { __u32 ivlen; __u8 iv[]; },
            # and ALG_SET_AEAD_ASSOCLEN), so nothing is packed by hand.
            op_sock.sendmsg_afalg(
                [aad + plaintext],
                op=socket.ALG_OP_ENCRYPT,
                iv=nonce,
                assoclen=len(aad),
            )

            # Kernels from 4.9 onward return AAD || ciphertext || tag: the
            # AAD is copied through unchanged, so skip it when slicing.
            expected = len(aad) + len(plaintext) + TAG_SIZE
            result = op_sock.recv(expected)
            if len(result) != expected:
                raise ValueError(f"Expected {expected} bytes, got {len(result)}")

            ciphertext = result[len(aad):len(aad) + len(plaintext)]
            tag = result[len(aad) + len(plaintext):]

            return nonce, ciphertext, tag

        finally:
            op_sock.close()
    finally:
        parent.close()

The most error-prone part of the algif_aead API is the AAD length signalling. The kernel’s sendmsg handler takes the associated-data length from the per-operation control message ALG_SET_AEAD_ASSOCLEN (type 4) to determine where associated data ends and plaintext begins. ALG_SET_AEAD_AUTHSIZE (setsockopt optname 5) sets the tag size, not the AAD length. The two are distinct parameters with adjacent numeric values, set through two different mechanisms. Getting them transposed produces either a kernel error (EINVAL) or silent data corruption where the cipher processes the wrong bytes as AAD.
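Decryption uses the same two mechanisms with the operation flipped. The sketch below is a hedged counterpart to the encryption routine, written against Python's socket.sendmsg_afalg() helper. It assumes a kernel that returns AAD || plaintext on success (4.9 or later) and that reports a tag mismatch as EBADMSG. The function name is illustrative.

```python
import errno
import socket

TAG_SIZE = 16  # refuse truncated tags


def aes_gcm_decrypt_af_alg(
    key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes, aad: bytes
) -> bytes:
    """Verify the tag and decrypt via AF_ALG.
    Raises ValueError if authentication fails; no plaintext is returned."""
    if len(tag) != TAG_SIZE:
        raise ValueError("expected a 16-byte GCM tag")

    with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as parent:
        parent.bind(("aead", "gcm(aes)"))
        parent.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)
        # Tag size is passed as optlen with a NULL optval.
        parent.setsockopt(
            socket.SOL_ALG, socket.ALG_SET_AEAD_AUTHSIZE, None, TAG_SIZE
        )
        op_sock, _ = parent.accept()
        with op_sock:
            # Decryption input layout: AAD || ciphertext || tag.
            op_sock.sendmsg_afalg(
                [aad + ciphertext + tag],
                op=socket.ALG_OP_DECRYPT,
                iv=nonce,
                assoclen=len(aad),
            )
            try:
                # Output layout: AAD || plaintext (the tag is consumed).
                result = op_sock.recv(len(aad) + len(ciphertext))
            except OSError as exc:
                if exc.errno == errno.EBADMSG:
                    raise ValueError("authentication failed") from None
                raise
    return result[len(aad):]
```

Callers should treat the ValueError exactly like InvalidTag in a userspace library: discard the message and never fall back to using unauthenticated bytes.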

6. Prefer Userspace Crypto Over AF_ALG

For the vast majority of applications, the correct answer is to not use AF_ALG at all. OpenSSL’s software AES-GCM implementation with AES-NI hardware support runs at 2–8 GB/s on modern hardware — faster than most applications can produce or consume data. The kernel AF_ALG interface adds at minimum two memory copies (plaintext to kernel, ciphertext back to userspace) and syscall overhead per operation. For any payload under approximately 4KB, the syscall overhead dominates and AF_ALG is slower than software AES-NI.

from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305
import os


def aes_gcm_encrypt(
    key: bytes, plaintext: bytes, aad: bytes
) -> tuple[bytes, bytes]:
    """
    Encrypt with AES-256-GCM via the cryptography library (OpenSSL backend).

    OpenSSL uses AES-NI for hardware acceleration automatically — no AF_ALG
    socket needed. Key material stays in process address space (acceptable
    for most threat models; use AF_ALG only if kernel-side key storage is
    a hard requirement).

    Returns (nonce, ciphertext_with_appended_tag).
    The cryptography library appends the 16-byte GCM tag to the ciphertext.
    """
    # Explicit check instead of assert: asserts are stripped under python -O.
    if len(key) != 32:
        raise ValueError("Use AES-256 (32-byte key)")

    # Random 96-bit nonce — safe for up to 2^32 encryptions per key
    nonce = os.urandom(12)

    aesgcm = AESGCM(key)
    # encrypt() returns ciphertext || tag (tag is appended, 16 bytes)
    ciphertext_with_tag = aesgcm.encrypt(nonce, plaintext, aad)

    return nonce, ciphertext_with_tag


def aes_gcm_decrypt(
    key: bytes, nonce: bytes, ciphertext_with_tag: bytes, aad: bytes
) -> bytes:
    """
    Decrypt and authenticate. Raises InvalidTag on authentication failure.
    The exception must be caught — do NOT use the plaintext if decryption raises.
    """
    aesgcm = AESGCM(key)
    # Raises cryptography.exceptions.InvalidTag if authentication fails.
    # Never suppress this exception. Never use the plaintext on failure.
    return aesgcm.decrypt(nonce, ciphertext_with_tag, aad)


def chacha20_poly1305_encrypt(
    key: bytes, plaintext: bytes, aad: bytes
) -> tuple[bytes, bytes]:
    """
    ChaCha20-Poly1305 for environments without hardware AES (ARM cores
    without the crypto extensions, older x86 CPUs without AES-NI,
    constrained devices). Software performance is comparable to
    hardware-accelerated AES-GCM. Nonce reuse is just as catastrophic
    as with AES-GCM.

    Returns (nonce, ciphertext_with_appended_tag).
    """
    if len(key) != 32:
        raise ValueError("ChaCha20-Poly1305 requires a 32-byte key")
    nonce = os.urandom(12)

    chacha = ChaCha20Poly1305(key)
    return nonce, chacha.encrypt(nonce, plaintext, aad)

When to use AF_ALG despite the above:

  • The application operates within a FIPS 140-3 boundary that requires the kernel’s validated crypto implementation. OpenSSL’s FIPS module is a separate validated implementation; check your compliance requirement before assuming AF_ALG is necessary.
  • The application stores long-lived key material and the threat model includes userspace memory scanning (e.g., process memory dumps, /proc/<pid>/mem reads by a co-resident privileged process). Kernel-side key storage via ALG_SET_KEY keeps the key outside userspace heap.
  • Bulk encryption of very large payloads (multi-GB) using hardware crypto accelerators that are not available to OpenSSL — specific server-class NICs with crypto offload engines.

For all other cases, the userspace cryptography library is the correct choice: simpler API, no kernel attack surface, equivalent or superior performance on modern hardware.

Expected Behaviour After Hardening

After seccomp blocks socket(AF_ALG, ...):

$ python3 -c "import socket; socket.socket(38, socket.SOCK_SEQPACKET, 0)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 22] Invalid argument

The error is EINVAL (errno 22), not EPERM — matching the errnoRet: 22 in the seccomp profile. Applications that call socket(AF_ALG, ...) and catch this error gracefully will fall back to their software implementation. Applications that do not handle the error will crash. The audit step in section 4 identifies which scenario applies before the block goes in.

After module blacklisting with install algif_aead /bin/true and a reboot:

$ lsmod | grep algif
(no output)

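$ modprobe algif_aead
# (exits silently with status 0: /bin/true ran instead of loading the module)

$ python3 -c "import socket; socket.socket(38, socket.SOCK_SEQPACKET, 0)"
OSError: [Errno 97] Address family not supported by protocol
# EAFNOSUPPORT (97) rather than EINVAL: this is the result when af_alg itself
# is blocked or not built. If only the algif_* modules are blocked, socket()
# succeeds and bind() to an algorithm type fails with ENOENT instead.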

After the auditd rule and no AF_ALG users are present:

$ ausearch -k af_alg_socket_create
<no matches>

After lsof -nP | grep 'protocol: ALG' with no AF_ALG sockets open:

(No output.)
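The outcomes in this section can be told apart programmatically by noting which call fails and with which errno. The probe below is a minimal sketch; the function name and the returned stage labels are assumptions made for illustration.

```python
import socket


def probe_af_alg(alg_type: str = "aead", alg_name: str = "gcm(aes)") -> tuple[str, int]:
    """Try to create and bind an AF_ALG socket.
    Returns (stage, errno): stage is 'socket' or 'bind' for the call that
    failed, or 'ok' with errno 0 if AF_ALG is fully usable."""
    family = getattr(socket, "AF_ALG", 38)
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        # Seccomp policy (errno from the profile) or af_alg unavailable.
        return "socket", exc.errno or 0
    with sock:
        try:
            sock.bind((alg_type, alg_name))
        except OSError as exc:
            # Socket family exists, but the algif_* type module or the
            # requested algorithm could not be provided.
            return "bind", exc.errno or 0
    return "ok", 0


stage, err = probe_af_alg()
print(f"AF_ALG probe: stage={stage} errno={err}")
```

A result of ('ok', 0) on a host that was meant to be hardened means neither control is in effect for the calling process.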

Trade-offs

Seccomp block vs. module blacklist. Seccomp operates per-process via a filter loaded at container or process start; it can be applied selectively to containerised workloads without affecting the host. Module blacklisting is a host-wide control that needs a reboot (or rmmod) to evict modules that are already loaded and cannot be scoped to specific processes. The two are complementary: blacklisting prevents the module from existing in kernel memory at all, while seccomp prevents specific processes from reaching it if blacklisting is not possible (e.g., on a host where another application, such as cryptsetup managing LUKS volumes, legitimately needs AF_ALG).

Blocking AF_ALG vs. application compatibility. Some Linux distributions ship the afalg OpenSSL engine, but it is only used when an application or openssl.cnf loads it. Applications that use openssl engine or ENGINE_by_id("afalg", ...) explicitly will fail with the block in place. The openssl engine command will tell you whether the engine is available. The dm-crypt kernel target does not use AF_ALG; it calls kernel-internal APIs such as crypto_alloc_aead(). The userspace cryptsetup tool is different: it uses AF_ALG (algif_skcipher, and algif_hash in some builds) for LUKS keyslot handling, so blocking those modules on a host with LUKS volumes can prevent them from unlocking. WireGuard uses kernel-internal crypto code and is unaffected.

Userspace crypto library vs. AF_ALG. The syscall and memory copy overhead of AF_ALG makes it slower than AES-NI-backed software crypto for payloads under roughly 4KB, which is the dominant payload size range for most application protocols. For very large payloads on hardware with dedicated crypto engines (for example NXP CAAM or Intel QuickAssist), AF_ALG can outperform software. Measure before assuming AF_ALG is faster; it frequently is not.

Random nonce vs. counter nonce. The code above uses os.urandom(12) for nonce generation. NIST SP 800-38D limits randomly generated 96-bit nonces to 2^32 messages per key, which keeps the probability of any collision below 2^-32. This is a deliberate safety margin, not the birthday bound: a 50% collision probability is only reached near 2^48 messages, but a single collision is catastrophic, so the limit is set far lower. For applications that encrypt more than a few billion messages under a single key, switch to a counter-based nonce scheme with strict state persistence, but only if you can guarantee that the counter state survives process restarts and is never shared across replicas without coordination. If you cannot make that guarantee, rotate the key more frequently instead.
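The collision arithmetic behind the message limit can be checked directly with the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / 2^(b+1)) for n random b-bit nonces. The helper name below is illustrative.

```python
import math


def collision_probability(messages: int, nonce_bits: int = 96) -> float:
    """Approximate probability that at least two of `messages` uniformly
    random nonces of `nonce_bits` bits are identical."""
    exponent = messages * (messages - 1) / 2 ** (nonce_bits + 1)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


p_at_limit = collision_probability(2**32)   # at the 2^32 message cap
p_at_2_48 = collision_probability(2**48)    # far beyond the cap

print(f"2^32 messages: p = {p_at_limit:.3e}")
print(f"2^48 messages: p = {p_at_2_48:.3f}")
```

The first value is on the order of 2^-33, which is why 2^32 is a safe cap; the second is roughly 0.39, showing that the even-odds region only begins around 2^48 messages.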

Failure Modes

Assuming the kernel patch is sufficient. CVE-2021-3543 and CVE-2019-8912 are patched in stable backport series. But Linux distributions sometimes ship kernels that lag several weeks behind stable, and container runtimes (older containerd, CRI-O versions) may load algif_aead into the host kernel namespace in ways that expose patched-but-still-complex attack surfaces. Patching eliminates these specific bugs but does not eliminate the class of bug — the algif_aead code continues to receive CVEs. Treat patch verification as necessary but not sufficient.

Blocking AF_ALG without auditing first. The seccomp or module blacklist goes in without the auditd step, and an application that uses afalg for encrypted database writes begins returning I/O errors. The error manifests as data corruption or silent write failures, not as an obvious crypto error. Run the audit step for at least 24 hours of representative traffic before deploying any block.

Nonce reuse in AF_ALG-mediated AES-GCM. A service uses AF_ALG for hardware-accelerated AES-GCM encryption of session data, generating nonces with random.randint(0, 2**96) seeded from time.time(). After a process restart under load, two workers start with the same timestamp seed and generate identical nonces for different messages under the same key. The authentication tag for both messages is now forgeable, and XOR of the two ciphertexts reveals XOR of the plaintexts. The kernel’s AF_ALG interface does not detect or prevent this — it encrypts what you send. Hardware acceleration of a broken protocol is faster failure.

Treating SCMP_ACT_ERRNO as a kill. The seccomp profile returns EINVAL rather than killing the process (SCMP_ACT_KILL). An application that catches OSError broadly and logs it will continue running without hardware crypto, which may or may not be correct depending on whether the fallback is to software crypto or to no crypto at all. Audit the application’s error handling for socket(AF_ALG, ...) before relying on silent fallback.

Container runtime default seccomp profiles. Docker’s default seccomp profile blocks a significant number of syscalls but, as of Docker 25.x, does not block socket(AF_ALG, ...) by default. Kubernetes with containerd does not apply any seccomp profile unless seccompProfile is explicitly set in the pod spec or the kubelet runs with --seccomp-default, and the runtime default it then applies still allows AF_ALG. Do not assume the runtime default is sufficient for this attack surface; it is not.

Checking only for algif_aead and missing related modules. The algif_hash and algif_skcipher modules expose hash functions and block ciphers respectively via AF_ALG. CVE-2019-8912 affects the core af_alg infrastructure, not just the AEAD module. Blocking algif_aead without blocking algif_hash and algif_skcipher leaves the af_alg_release_parent() race condition reachable via the hash and skcipher interfaces on unpatched kernels. Blocking all three algif_* modules still leaves af_alg itself loadable, so socket(AF_ALG, ...) keeps working up to the bind() call; block af_alg as well if the core socket code must be unreachable.