Articles

Every article follows the same structure: Problem, Threat Model, Configuration, Expected Behaviour, Trade-offs, and Failure Modes. No fluff.

Cross-Cutting Guides

intermediate 14 min read

API Key Lifecycle at Scale: Issuance, Rotation, Scoping, and Audit Across Cloud and SaaS

API keys are the most-leaked credential type. Treating their lifecycle as a tracked property — issued, scoped, rotated, revoked — is the difference between hygiene and incident.

intermediate 14 min read

Production Access Management with Teleport and Boundary: Brokered, Recorded, Auditable Access

Static SSH keys + bastion hosts is the 1990s model. Teleport / Boundary broker access dynamically, record sessions, and integrate with identity. The 2026 default.

intermediate 14 min read

Tabletop Exercises and Chaos Security Drills: Building, Running, and Acting on Findings

Tabletops without follow-through are theatre. Chaos security drills make findings unavoidable. Both, run together, build organizational muscle for real incidents.

advanced 16 min read

Secrets Rotation Orchestration: Coordinating Vault, KMS, OIDC, and Database Credentials

Rotation isn't just minting a new secret. It's a sequenced operation across producers, consumers, and stale-credential drains. Most outages happen during rotation.

advanced 17 min read

SPIFFE and SPIRE for Workload Identity Across Clusters and Clouds

Cryptographic workload identity that survives across Kubernetes clusters, cloud accounts, and on-prem hosts. SPIFFE replaces shared secrets with attestation.

intermediate 16 min read

Threat Modeling at Scale: STRIDE-per-Component, PASTA, and Continuous Threat Modeling

Threat modeling does not scale by adding more whiteboard sessions. Codify the methodology, embed in design review, and treat threat models like code.

advanced 18 min read

Post-Quantum Crypto Migration Plan: Hybrid TLS, SSH, Code Signing, and Encryption at Rest

NIST finalized ML-KEM and ML-DSA in 2024. Harvest-now-decrypt-later is already happening. A migration plan that covers TLS, SSH, artifact signing, and secrets is now tractable.

advanced 24 min read

Identity Abuse and Credential Compromise: Defending Against Attackers Who Log In Instead of Break In

Nearly 80% of intrusion detections in 2026 are malware-free. Attackers steal valid credentials, hijack session tokens, exploit federated access, and bypass weak MFA to move laterally without triggering traditional malware detection. This article covers the defensive controls for identity-based attacks.

advanced 26 min read

Ransomware 3.0 and Multi-Stage Extortion: Defence, Detection, and Recovery

Ransomware has evolved from simple encryption to multi-stage extortion: data theft, encryption, public exposure threats, and DDoS. Ransomware-as-a-Service groups operate with dedicated negotiation teams and support desks. This article covers the defensive architecture that reduces blast radius, detects early-stage ransomware behaviour, and enables recovery without paying.

intermediate 14 min read

The Hardening Scorecard: Measuring and Tracking Security Posture

"Are we more secure than last month?" is a question most teams cannot answer. Security tools produce individual outputs: kube-bench returns a CIS score...

intermediate 16 min read

Compliance-as-Code: Mapping CIS Benchmarks to Automated Checks with InSpec and Kube-bench

Manual compliance audits are point-in-time snapshots that are outdated before the report is written.

intermediate 20 min read

Hardening PostgreSQL for Production: Authentication, Encryption, Row-Level Security, and Audit Logging

PostgreSQL defaults prioritise developer convenience over security. A stock installation on most distributions allows local trust authentication (any...

advanced 35 min read

Hardening a Complete Kubernetes Platform: From Cluster Bootstrap to Production-Ready

A fresh Kubernetes cluster (whether bootstrapped with kubeadm, k3s, or provisioned by a managed provider) ships with defaults optimised for getting...

intermediate 15 min read

Incident Response Hardening Playbook: From Detection to Post-Mortem

During an active security incident, hardening is reactive: isolate the compromised system, contain the blast radius, preserve evidence, and stop the...

advanced 15 min read

Security Infrastructure Disaster Recovery: Vault, PKI, and SIEM Failover

When your security infrastructure fails, you are flying blind. If Vault is down, applications cannot retrieve secrets and new deployments stall.

intermediate 16 min read

Migrating from Self-Hosted Prometheus to Grafana Cloud: Preserving Dashboards, Alerts, and History

Self-hosted Prometheus consumes 500GB+ of storage within 6 months for a 20-node Kubernetes cluster.

intermediate 18 min read

Securing Message Queues in Production: Kafka, RabbitMQ, and NATS Hardening

Message brokers carry some of the most sensitive data in any architecture: payment events, user actions, system commands, and PII in event streams.

advanced 15 min read

Multi-Cloud Hardening: Consistent Security Posture Across Providers

Running infrastructure across multiple cloud providers means maintaining consistent security controls across fundamentally different systems.

advanced 16 min read

Zero Trust Networking: Identity-Based Access Beyond Perimeter Security

Perimeter security assumes the internal network is safe. It is not. A single compromised pod, a stolen VPN credential, or a malicious insider gives...

beginner 18 min read

Security Hardening for Small Teams: Prioritising Controls When You Cannot Do Everything

A team of 1-5 engineers cannot implement 100 hardening controls simultaneously. Most hardening guides present controls as equally important, leaving...

advanced 22 min read

Migrating from Self-Managed Kubernetes to a Managed Provider Without Losing Your Security Posture

Self-managed Kubernetes clusters (kubeadm, k3s, kops) consume 8-16 hours per month of engineering time for control plane maintenance: etcd backups,...

intermediate 14 min read

Hardening Redis in Production: Authentication, TLS, ACLs, and Command Restriction

Redis defaults prioritise developer convenience: no authentication, no TLS, all 200+ commands available, and binding to all interfaces.

Kubernetes / Platform

advanced 14 min read

CSI Driver Security: Volume-Mount Hardening, Privileged Drivers, and Inline Ephemeral Volumes

CSI drivers run with broad privileges by design. Their security posture often goes unaudited — until one is the exfil path or the privilege-escalation step.

intermediate 13 min read

External Secrets Operator: Pulling Secrets from KMS, Vault, and Cloud Stores into Kubernetes

Native Kubernetes Secrets are readable by anyone with get access to Secrets in the namespace. External Secrets Operator pulls from your real secret store on a schedule, with rotation and audit.

intermediate 13 min read

Native Sidecar Containers in Kubernetes 1.29+: Lifecycle, Security, and Mesh Migration

Init containers with restartPolicy: Always, beta and on by default since 1.29 (GA in 1.33), fix the long-standing init/main race. Bigger security wins for service-mesh and log-shipper deployments.

advanced 16 min read

Confidential Containers on Kubernetes: AMD SEV-SNP, Intel TDX, and the Attestation Flow

Confidential Containers move workload isolation from the kernel to the silicon. Encrypted memory, hardware-attested boot, and a different threat model than user namespaces.

advanced 14 min read

User Namespaces for Pods: UID Remapping, Container Escape Defense, and the GA Path in Kubernetes 1.30+

hostUsers: false remaps Pod UIDs into a per-Pod range. A container running as root sees uid 0 inside; the host sees an unprivileged user. Big hardening win, easy to enable.

intermediate 15 min read

ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks

VAP replaces webhook admission for the policies you write most often. No Kyverno, no OPA, no network round-trip, no webhook availability risk.

intermediate 17 min read

Gateway API Security Patterns: Multi-Team Routing, ReferenceGrant, and Delegated Trust on Kubernetes

Gateway API replaces Ingress with a multi-role model that separates infrastructure, cluster operator, and application developer concerns. New surface, new threat model.

advanced 26 min read

LLMs on Kubernetes: Understanding the Threat Model and Deploying an LLM Gateway

Kubernetes orchestrates LLM workloads but has no awareness of what those workloads do. An Ollama pod with healthy readiness probes and stable resource usage can still leak secrets, act on injected prompts, and grant models excessive agency over internal services. This article covers the LLM-specific threat model for Kubernetes and implements an LLM gateway as the policy enforcement layer.

intermediate 22 min read

Kubernetes Node Hardening: From OS Configuration to kubelet Lockdown

A Kubernetes node is a Linux machine running kubelet, a container runtime, and your workloads.

advanced 16 min read

GPU Workload Isolation: MIG, MPS, and vGPU Security Boundaries

Multi-tenant GPU sharing without isolation risks data leakage between workloads through shared GPU memory.

intermediate 13 min read

GPU Cost and Security Monitoring: Detecting Abuse and Optimising Spend

GPU compute costs between $2 and $30 per hour per device. A single unauthorised cryptocurrency mining pod running on an A100 for a weekend generates...

intermediate 14 min read

LLM Rate Limiting in Production: Token Budgets, Per-User Quotas, and Abuse Detection

Request-count rate limiting fails for LLM workloads because a single request can consume 100K tokens. Token-based rate limiting with per-user quotas and abuse detection prevents runaway costs and catches prompt injection probing before it escalates.

advanced 22 min read

Runtime Security with Falco on Kubernetes: Rules, Tuning, and Response Automation

Prevention-only security has a binary failure mode: either the control holds and the attacker is stopped, or the control fails and the attacker...

intermediate 22 min read

Kubernetes Network Policies That Actually Work: From Default Deny to Microsegmentation

By default, every pod in a Kubernetes cluster can communicate with every other pod across all namespaces. There are no network boundaries.

intermediate 15 min read

LLM Cost Controls: Budget Enforcement, Token Metering, and Spend Alerting

Without enforced budgets, a single team can exhaust an organization's entire AI spend in days. Token metering with per-team budgets, automatic request rejection at limits, model routing by cost, and chargeback dashboards turn LLM spending from a surprise into a managed line item.

intermediate 18 min read

Kubelet Security Configuration: Authentication, Authorization, and Read-Only Port

The kubelet runs on every node in the cluster with root-level access to the container runtime, all pod specifications, mounted secrets, and the host...

intermediate 20 min read

Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers

RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months.

intermediate 20 min read

Kubernetes Secrets Management: External Secrets Operator, Vault, and Sealed Secrets

Kubernetes Secrets are base64-encoded, not encrypted. Anyone with RBAC read access to secrets in a namespace can decode every credential stored there.

advanced 18 min read

AI Incident Forensics: Reconstructing What an AI System Did, Why, and What Data It Accessed

When a traditional application causes an incident, you examine logs, traces, and database queries to reconstruct what happened.

intermediate 16 min read

Hardening Model Inference Endpoints: Authentication, Rate Limiting, and Input Validation

Model inference endpoints are GPU-backed and expensive: $2-30 per hour per GPU. A single unprotected endpoint exposed to the internet can accumulate...

intermediate 22 min read

Kubernetes Admission Control: From PodSecurity Standards to Custom OPA/Kyverno Policies

Without admission control, any user with deployment permissions can run privileged containers, mount the host filesystem, use the host network, run...

advanced 16 min read

AI Data Leakage Prevention: Input Filtering, Output Scanning, and Audit Trails

AI systems leak data in ways traditional applications do not. A language model trained on customer data can reproduce verbatim customer records in...

intermediate 14 min read

Jupyter Notebook Security: Authentication, Isolation, and Data Protection

JupyterHub is a code execution platform. Every notebook cell is arbitrary code running with whatever permissions the notebook server process has.

intermediate 20 min read

Multi-Tenancy Hardening in Kubernetes: Namespace Isolation, Resource Quotas, and Network Boundaries

Kubernetes namespaces provide logical separation, not security isolation. By default, pods in namespace A can send network traffic to pods in...

advanced 17 min read

Building a Content Filtering Pipeline for LLM Applications: From Raw Input to Safe Output

A single content filter is not a pipeline. Most LLM deployments add one filter (usually on output) and call it done.

advanced 17 min read

AI Red Teaming Methodology: Structured Adversarial Testing for LLM Applications

Traditional security testing (penetration testing, vulnerability scanning) does not cover AI-specific attack surfaces.

intermediate 20 min read

Kubernetes Image Policy Enforcement: Cosign, Notation, and Admission Webhooks

Without image policy enforcement, any container image from any registry can run in a Kubernetes cluster.

advanced 16 min read

Securing RAG Pipelines: Vector Database Access Control, Document Poisoning, and Retrieval Filtering

Retrieval-Augmented Generation (RAG) adds a knowledge base to LLM applications: the model retrieves relevant documents before generating a response.

intermediate 20 min read

Pod Security Context Deep Dive: runAsNonRoot, readOnlyRootFilesystem, and Capabilities

Kubernetes SecurityContext has over 15 configurable fields, but most teams only set runAsNonRoot: true and consider the job done.

intermediate 18 min read

Vector Database Security: Access Control, Embedding Protection, and Query Isolation

Vector databases are the backbone of RAG (Retrieval-Augmented Generation) systems.

intermediate 17 min read

A/B Model Deployment Safety: Canary Rollouts, Traffic Splitting, and Automated Rollback for ML Models

Deploying a new ML model version is not the same as deploying a new application version.

intermediate 22 min read

Kubernetes API Server Hardening: Flags, Authentication, and Audit Logging

The API server is the front door to the Kubernetes cluster. Every kubectl command, every controller reconciliation, every pod scheduling decision,...

intermediate 20 min read

Seccomp Profiles for Production Workloads: Writing, Testing, and Deploying Custom Profiles

The default container runtime allows approximately 300 syscalls. A compromised container can use unshare to create new namespaces, clone to spawn...

intermediate 18 min read

etcd Encryption at Rest: Configuration, Key Rotation, and Performance Impact

Kubernetes Secrets are stored in etcd as base64-encoded plaintext. Base64 is an encoding, not encryption.

advanced 18 min read

Implementing AI Guardrails: Input Validation, Output Filtering, and Safety Classifiers in Production

Deploying an LLM without guardrails is deploying an application where any user can make it say or do anything.

intermediate 21 min read

Hardening Kubernetes Ingress Controllers: NGINX, Traefik, and Envoy Compared

The ingress controller is the internet-facing entry point to a Kubernetes cluster.

advanced 18 min read

LLM Observability in Production: Monitoring Latency, Token Usage, Safety Violations, and Drift

Traditional application monitoring (CPU, memory, HTTP status codes, latency) tells you nothing about what an LLM is doing.

intermediate 16 min read

Hardening Model Serving Frameworks: TorchServe, Triton, and vLLM Security Configuration

Model serving frameworks ship with defaults optimised for development: management APIs exposed on all interfaces without authentication, model files...

advanced 18 min read

Securing Fine-Tuning Pipelines: Data Isolation, Checkpoint Integrity, and Access Control

Fine-tuning pipelines are high-value targets. They consume expensive GPU hours, process proprietary training data, and produce model checkpoints that...

intermediate 18 min read

Hardening the Kubernetes Scheduler: Topology Constraints and Security-Aware Placement

The Kubernetes scheduler places pods on nodes based on resource availability and basic constraints.

intermediate 22 min read

Kubernetes Audit Log Analysis: What to Log, How to Query, and What to Alert On

Kubernetes audit logs record every request to the API server: who made the request, what they asked for, and whether it succeeded.

advanced 14 min read

Securing Model Artifact Pipelines: From Training to Serving

Model files are opaque binaries ranging from 1GB to over 1TB. You cannot code-review a set of weights.

advanced 17 min read

RLHF Data Protection: Securing Human Feedback Loops, Preference Data, and Reward Models

Reinforcement Learning from Human Feedback (RLHF) pipelines introduce unique security surfaces that standard ML training workflows do not have.

intermediate 13 min read

AI API Key Management: Rotation, Scoping, and Abuse Detection

AI services have turned API keys into direct spending controls. A leaked OpenAI or Anthropic key can generate thousands of dollars in charges within...

advanced 16 min read

Prompt Injection Defence in Production: Input Validation, Output Filtering, and Monitoring

Prompt injection is the SQL injection of AI systems, the most common and most damaging attack class against LLM-powered applications.

advanced 15 min read

Network Segmentation for AI Training Infrastructure

AI training clusters frequently share networks with production services. A training job that can reach the production database is one compromised...

intermediate 14 min read

Observability for LLM Applications: Token Usage, Latency Anomalies, and Output Classification

LLM-powered applications have unique observability requirements that standard APM tools do not address: token-based cost tracking (not just request...

intermediate 16 min read

Model Registry Access Control: Versioning, Signing, and Promotion Gates

Model registries are the bridge between training and production. A model pushed to the production registry gets served to users.

intermediate 19 min read

Kubernetes Service Account Token Security: Bound Tokens, Projected Volumes, and OIDC

Every pod in Kubernetes receives a service account token by default. In clusters running older configurations or without explicit hardening, these...

Linux / OS Hardening

advanced 14 min read

dm-verity and dm-integrity: Tamper-Evident Block-Level Roots for Production Linux

dm-verity gives you a read-only root that fails to mount if a single block is tampered with. dm-integrity adds runtime checksumming. Together: immutable, evidence-bearing systems.

advanced 14 min read

eBPF-LSM (lsm_bpf): Kernel Security Policy as Hot-Loadable BPF Programs

lsm_bpf attaches eBPF programs to LSM hooks. Define security policy in code, push without reboot, audit at the syscall boundary. AppArmor for cloud-native systems.

intermediate 13 min read

USBGuard: USB Device Authorization on Production Linux Hosts

USB devices are a peripheral attack surface most servers ignore. USBGuard provides allowlist-based authorization, blocking BadUSB and malicious-cable threats.

intermediate 13 min read

FIDO2 SSH with sk-* Keys: Hardware-Backed Authentication for Production Hosts

ed25519-sk and ecdsa-sk bind SSH keys to a hardware token. Phishing-resistant, exfiltration-proof, increasingly the default. Two short commands to switch.

intermediate 14 min read

Kernel Lockdown Mode: Blocking Root from Modifying the Running Kernel

Lockdown mode separates root from the kernel. The integrity level blocks modification of kernel code; the confidentiality level also blocks reads of kernel memory. Cheap, broad, underused.

advanced 16 min read

Landlock LSM: Unprivileged Kernel Sandboxing for Production Linux Applications

Landlock lets an unprivileged process restrict its own filesystem and network access at the kernel level. AppArmor without root, seccomp with semantics.

advanced 16 min read

io_uring Security and Hardening: Disabling, Restricting, and Auditing a Bypass-Prone Syscall Interface

io_uring gives userspace a submission queue that sidesteps the normal syscall path. It has produced a steady stream of kernel CVEs and routinely bypasses seccomp.

intermediate 24 min read

Secure Cloud VM Access: SSH Key Authentication, Two-Factor Login, VPN, and Audit Logging

Cloud VMs exposed to the internet with password-only SSH are compromised within hours. This article covers the complete secure access stack: SSH key authentication, TOTP two-factor login, WireGuard VPN as a network-layer gate, and audit logging to track who did what and when.

intermediate 20 min read

SSH Hardening Beyond the Basics: Certificate Authentication, Jump Hosts, and Logging

Every SSH hardening guide starts and ends with the same three changes: disable root login, require key-based authentication, change the default port.

intermediate 15 min read

Hardening DNS Resolution on Linux: systemd-resolved, Unbound, and DNS-over-TLS

Most Linux hosts resolve DNS in plaintext over UDP port 53. On a stock Ubuntu 24.04 or RHEL 9 system:

intermediate 18 min read

Hardening the Linux Kernel Attack Surface with sysctl and Boot Parameters

Linux kernels ship with defaults optimised for compatibility, not security. On a stock Ubuntu 24.04 or RHEL 9 installation:

advanced 14 min read

Hardening GRUB and the Boot Process: Secure Boot, Boot Passwords, and Tamper Detection

Without boot security, an attacker with physical access or console access (BMC, IPMI, cloud serial console) to a Linux system can...

intermediate 13 min read

Hardening /proc and /sys: Restricting Kernel Information Disclosure

/proc and /sys are virtual filesystems that expose kernel internals, hardware details, and process information to userspace.

intermediate 16 min read

Linux Audit Framework Deep Dive: auditd Rules, auditctl, and ausearch for Security Monitoring

auditd is the kernel-level audit system on Linux: it captures syscalls, file access, user commands, and privilege changes that no userspace tool can...

intermediate 16 min read

Linux Firewall Hardening with nftables: Replacing iptables in Production

iptables is deprecated. nftables is the replacement in every modern Linux kernel (5.0+).

intermediate 15 min read

Cgroup v2 Resource Isolation: Preventing Resource Exhaustion Attacks on Shared Systems

Without resource limits, a single service, container, or compromised process can consume all available CPU, memory, I/O bandwidth, or PIDs on a host.

advanced 18 min read

SELinux in Production: Writing Custom Policies Without Losing Your Mind

SELinux is the most powerful mandatory access control system on Linux, and the most disabled. The result: services have no MAC confinement.

intermediate 14 min read

Time Synchronization Security: Hardening NTP and Chrony Against Manipulation

Accurate time is a silent dependency of almost every security control on a Linux system.

intermediate 22 min read

Automated OS Hardening with Ansible: A Production-Ready Playbook Collection

Manual OS hardening does not scale. The sysctl settings from Hardening the Linux Kernel Attack Surface with sysctl and Boot...

intermediate 14 min read

PAM Configuration Hardening: Password Policies, Login Controls, and MFA Integration

PAM (Pluggable Authentication Modules) is the authentication foundation on Linux.

intermediate 13 min read

Kernel Module Hardening: Blacklisting, Signing, and Preventing Runtime Loading

The Linux kernel loads modules on demand. When a process requests a capability that is not built into the running kernel (a filesystem type, a...

intermediate 16 min read

Hardening Container Base Images: From ubuntu:latest to a Minimal, Signed, Scannable Image

ubuntu:latest ships with over 200 packages. At any given point, a vulnerability scan with Trivy will report 50 or more CVEs, most of which are in...

intermediate 14 min read

AppArmor Profiles for Custom Applications: From Complain Mode to Enforce

AppArmor is the default mandatory access control system on Ubuntu and Debian. It restricts applications to specific file paths, capabilities, and...

intermediate 20 min read

systemd Unit Hardening: ProtectSystem, PrivateTmp, and the Full Sandbox Toolkit

systemd provides over 30 security-relevant directives for sandboxing services, yet the vast majority of unit files (including those shipped by...

intermediate 14 min read

Filesystem Mount Options That Matter: noexec, nosuid, nodev, and Beyond

Default Linux installations mount most filesystems with permissive options. On a stock Ubuntu 24.04 or RHEL 9 system:

Network & API Security

intermediate 14 min read

HAProxy Production Hardening: Beyond TLS, Request Filtering, ACLs, and Logging Hygiene

HAProxy's defaults are friendly to misconfiguration. The right knobs make it fast, observable, and resistant to common L7 abuse.

advanced 14 min read

Service Mesh Egress Gateway Patterns: Bounded Outbound Traffic in Istio Clusters

Pod egress in a service mesh is a per-Pod decision; egress gateways centralize, audit, and bound it. The pattern that finally makes 'where can my workload reach' answerable.

intermediate 14 min read

WireGuard Mesh for Internal Zero-Trust Networking: wg-quick, Tailscale, Netbird Compared

WireGuard turns the public Internet into an internal network. Three deployment patterns, three different operational models, one cryptographic core.

advanced 14 min read

eBPF-XDP for L4 DDoS Mitigation: Line-Rate Drop in the Kernel

XDP runs your filter at the network driver level, before the kernel allocates an sk_buff. Drop attacks at line rate on commodity NICs with a few hundred lines of eBPF.

intermediate 14 min read

Encrypted Client Hello (ECH) Deployment on NGINX, Cloudflare, and Internal Edges

TLS 1.3 still leaks the destination hostname via SNI. ECH closes that gap. Browser support is now wide enough to deploy in production.

intermediate 13 min read

HTTP/2 RST and CONTINUATION Flood Mitigation: CVE-2023-44487, CVE-2024-27316, and Beyond

Two recent CVE classes weaponize HTTP/2's stream and header model. Mitigation is a settings tweak in NGINX and Envoy, but only if you know which knobs to turn.

intermediate 16 min read

HTTP/3 and QUIC Production Hardening: UDP Amplification, 0-RTT Replay, and Connection ID Privacy

QUIC moves TLS into the transport. New attack surface: UDP amplification, 0-RTT replay, connection ID tracking, stream flow-control abuse. Hardening is non-trivial.

advanced 24 min read

DDoS Megascale Operations: Defending Against AI-Orchestrated Terabit Attacks and Botnet Smokescreens

AI-powered botnets of compromised IoT and edge devices launch DDoS attacks exceeding 1 terabit per second. These attacks are increasingly used as smokescreens for simultaneous data theft operations. This article covers the multi-layer defensive architecture from edge absorption to origin hardening.

intermediate 18 min read

IPv6 Security in Production: Hardening Dual-Stack Deployments

Most production environments run dual-stack (IPv4 and IPv6) whether the team intended it or not. Linux enables IPv6 by default.

intermediate 20 min read

gRPC API Gateway Patterns: Authentication, Rate Limiting, and Request Validation at the Edge

gRPC services exposed through API gateways face unique security challenges: gRPC-Web transcoding introduces injection surfaces, metadata headers can carry internal routing information past the edge, and per-method rate limiting requires gRPC-aware configuration.

intermediate 20 min read

NGINX Hardening Beyond TLS: Request Filtering, Buffer Limits, and Connection Controls

Most NGINX hardening guides stop at TLS configuration: cipher suites, certificate setup, HSTS.

intermediate 20 min read

Rate Limiting at the Ingress Layer: NGINX, Envoy, and Cloud Load Balancers Compared

Rate limiting is the first line of defence against abuse, credential stuffing, API scraping, and denial-of-service attacks.

intermediate 22 min read

Protecting Internal APIs: Network Segmentation, Authentication, and Access Logging

"It's internal" is the most dangerous phrase in infrastructure security. Internal APIs sit behind the perimeter and receive minimal scrutiny.

intermediate 18 min read

Load Balancer Security: Health Check Abuse, Connection Draining, and TLS Termination

Load balancers sit at the most critical point in your infrastructure: every external request passes through them.

intermediate 22 min read

API Gateway Security: Authentication, Authorization, and Request Validation

Without a centralized API gateway, authentication and authorization logic is duplicated in every backend service. This creates several problems:

intermediate 18 min read

TLS 1.3 Configuration for NGINX and Envoy: Ciphers, Certificates, and OCSP Stapling

TLS misconfiguration remains one of the most common security findings in production infrastructure.

intermediate 22 min read

mTLS for Service-to-Service Communication: Istio, Linkerd, and DIY with cert-manager

Internal service-to-service traffic in most Kubernetes clusters is plaintext. Once an attacker compromises a single pod, through a container escape,...

intermediate 18 min read

gRPC Load Balancing Security: Client-Side, Proxy, and Service Mesh Patterns

L4 load balancers break gRPC multiplexing, sending all streams to a single backend. This article covers L7 balancing with Envoy, client-side balancing with xDS, health check hardening, and connection draining for secure gRPC deployments.

intermediate 18 min read

DNS Security for Production Infrastructure: DNSSEC, CAA Records, and Internal Resolution

DNS is the most critical single point of failure in any infrastructure, and the least hardened layer for most teams.

intermediate 22 min read

WAF Rule Tuning That Does Not Break Legitimate Traffic: ModSecurity and Coraza in Practice

A self-managed Web Application Firewall (WAF) with default rules generates dozens of false positives per day.

intermediate 20 min read

Preventing HTTP Request Smuggling: Configuration for NGINX, HAProxy, and Envoy

HTTP request smuggling exploits inconsistencies in how chained HTTP processors (reverse proxies, load balancers, backend servers) parse request...

intermediate 18 min read

HTTP Security Headers in Production: CSP, HSTS, and Permissions-Policy Without Breaking Your App

Security headers are free, server-side controls that instruct browsers to restrict dangerous behaviour.

intermediate 18 min read

Hardening WebSocket Connections: Authentication, Rate Limiting, and Origin Validation

WebSocket connections start as an HTTP upgrade request and then persist as a long-lived, full-duplex channel.

intermediate 22 min read

gRPC Security in Production: TLS, Authentication, and Interceptor-Based Access Control

gRPC services in production frequently run with security configurations that would never be acceptable for HTTP APIs:

CI/CD & Supply Chain

intermediate 13 min read

Just-in-Time CI Access for Production Deploys: Approval Flows and Bounded Permissions

Standing CI permissions are a liability. JIT mints production permissions only at deploy time, with explicit approval and short lifetime.

intermediate 13 min read

Renovate and Dependabot Security Configuration: Auto-Merge Boundaries and Scope Rules

Bots that update dependencies are great until one auto-merges a malicious release. The defaults are safe-ish; the configuration that makes them production-safe is more deliberate.

intermediate 13 min read

GitHub Apps vs PATs vs Deploy Keys vs OIDC: Choosing the Right SCM Identity

Four identity types, four very different scope/lifetime/permission models. Pick wrong and you ship the wrong-shaped credential to every CI run for years.

advanced 14 min read

Ephemeral CI Runners with Firecracker and Kata: VM-Level Isolation for Build Jobs

Container-based CI runners share a host kernel. Firecracker and Kata give each job its own kernel and a fresh VM — large blast-radius reduction, modest cost.

intermediate 15 min read

OIDC Federation Hardening: Locking Down CI-to-Cloud Trust Policies

OIDC federation between CI and cloud removes long-lived secrets. The trust policies that grant the access are the new attack surface, and most are too loose.

intermediate 14 min read

Branch Protection and Repository Policy as Code: Terraform GitHub for Hundreds of Repos

Hand-clicking branch protection rules across 200 repos guarantees drift. Terraform + the github provider + a shared module makes it auditable, reviewable, and reversible.

intermediate 15 min read

CI/CD Pipeline Egress Control: Runner Network Isolation, Allowlists, and Supply-Chain Exfiltration Defense

Most build pipelines run with unrestricted outbound internet. A single compromised dependency exfiltrates secrets, tokens, and source code in seconds.

advanced 24 min read

Software Supply Chain and Third-Party Exposure: Defending Against Upstream Compromise

Attackers no longer need to breach you directly when they can compromise a vendor, open-source library, or managed service provider that you trust. A single poisoned dependency can cascade into thousands of downstream organisations. This article covers the controls that detect and contain supply chain compromise.

intermediate 16 min read

Secret Management in CI/CD Pipelines: Vault, SOPS, and OIDC Federation

Static credentials in CI/CD pipelines are the leading cause of secret sprawl. Teams store long-lived API keys, database passwords, and cloud provider...

intermediate 14 min read

Software Bill of Materials (SBOM) Generation and Consumption in CI/CD

SBOM generation is easy: run Syft and get a list of every package in your container image.

intermediate 16 min read

Terraform Security: State File Protection, Provider Pinning, and Plan Review Automation

Terraform state files contain every secret, IP address, and configuration detail of your infrastructure in plaintext JSON.

intermediate 16 min read

Container Registry Security: Access Control, Vulnerability Scanning, and Garbage Collection

Container registries store the most sensitive artifacts in your deployment pipeline.

intermediate 14 min read

Pipeline-as-Code Security: Preventing CI Configuration Tampering

CI/CD pipeline definitions live alongside application code in Git.

intermediate 17 min read

Hardening Helm Values: Schema Validation, Secret Injection, and Security Defaults

Helm values files control security-critical Kubernetes fields like security contexts, image references, and resource limits. Without schema validation, a single misconfigured value can deploy a privileged container or pull an unscanned image.

intermediate 18 min read

Securing CI/CD Runners: Isolation, Credential Scoping, and Ephemeral Environments

CI/CD runners are the most privileged, least monitored components in most infrastructure.

intermediate 14 min read

Securing Helm Charts: Chart Signing, Value Injection, and Template Security

Helm is the dominant package manager for Kubernetes, but most teams install charts without verifying provenance, pass unvalidated values that end up...

intermediate 16 min read

Helm Supply Chain Security: OCI Registries, Provenance Verification, and Chart Mirroring

Helm charts pulled from public repositories are unsigned, unverified, and executed with whatever permissions their templates request. This article covers OCI-based chart storage, cosign signing and verification, chart mirroring for airgapped environments, and Kyverno policies to enforce signed charts.

advanced 16 min read

Artifact Integrity Verification: Checksums, Signatures, and Transparency Logs

Build artifacts pass through multiple stages between source code and production deployment.

intermediate 16 min read

Securing GitHub Actions: Permissions, Pinning, and Workflow Injection Prevention

GitHub Actions is the most widely used CI/CD platform, but its security model is scattered across dozens of documentation pages.

intermediate 14 min read

Dependency Pinning and Lockfile Integrity: Preventing Supply Chain Attacks in CI

Dependency confusion and typosquatting attacks exploit the gap between "I declared a dependency" and "I verified the dependency I got." Version pinning...

advanced 15 min read

Reproducible Builds for Container Images: Achieving Deterministic Output

Two builds from the same source code should produce the same container image. In practice, they almost never do.

intermediate 16 min read

GitOps Security Model: Separation of Duties, Drift Detection, and Rollback Controls

GitOps centralizes deployment authority in Git repositories. Tools like ArgoCD and Flux watch Git repositories and reconcile cluster state to match...

advanced 16 min read

SLSA Provenance for Container Images: From Build to Admission Control

Without provenance, you cannot prove where a container image came from, what source code it was built from, or whether the build process was tampered...

AI & Security Landscape

intermediate 14 min read

AI Agent Observability and Tracing: OpenTelemetry for Agent Runs and Tool Calls

An agent's run is a graph of model calls, tool invocations, and decisions. Observability that maps cleanly to that graph is the difference between debugging and guessing.

advanced 14 min read

AI Model Output Watermarking: Provenance for Generated Text and Code

SynthID, the Aaronson scheme, and lexical watermarks embed signatures in model output. Detection works statistically. None survives heavy editing — useful but bounded.

advanced 14 min read

Continuous AI Red-Teaming Pipelines: Automated Adversarial Testing in CI

Manual red-teaming finds gaps once. Continuous pipelines find regressions every model upgrade. The infrastructure exists; most teams haven't wired it up.

intermediate 16 min read

C2PA Content Credentials: Cryptographic Provenance for AI-Generated Media in Production

Synthetic media is now indistinguishable from camera output. Content Credentials are the practical defense — signed manifests embedded in the file itself.

intermediate 14 min read

MCP Authentication Patterns: OAuth 2.1, Capability Tokens, and Per-Tool Authorization

MCP servers expose tool surfaces to LLM agents. The auth model decides what an agent can do — and most deployments leave it underspecified.

advanced 14 min read

Prompt Cache Security: Side-Channels, Poisoning, and Tenant Isolation in LLM Provider Caches

Provider-side prompt caching speeds up applications by 30-90% — and introduces a new attack surface with timing side-channels and poisoning vectors.

advanced 18 min read

Agent Memory Poisoning: Defending the Persistence Layer of Long-Running LLM Agents

Agents with long-term memory survive across sessions. Anything poisoned into that memory persists. A one-shot prompt injection becomes a permanent behavioural change.

advanced 26 min read

AI-Adaptive Malware: How Modern Payloads Change Behaviour Based on Their Environment and How to Defend Against Them

A modern virus is not the same as a virus from five years ago. AI-generated payloads observe their environment, profile the host, detect sandboxes, adapt their persistence mechanism to the OS they land on, and modify their C2 communication to blend with normal traffic. Every instance is unique. This article covers how adaptive malware works and the defensive controls that defeat it.

advanced 24 min read

Running AI-Powered Security Assessments on Your Own Infrastructure: Using Frontier Models Before Attackers Do

If Anthropic's Mythos can find your vulnerabilities, so can every attacker with API access. The only rational response is to find them first. This article covers how to run systematic AI-powered security assessments across your code, infrastructure-as-code, and runtime configuration.

intermediate 22 min read

Defending Against AI-Amplified Social Engineering: Phishing, Voice Cloning, and Deepfake Impersonation

Generative AI has eliminated every traditional indicator of phishing: perfect grammar, personalised context, cloned executive voices, and real-time video deepfakes. This article covers the defensive controls that work when human judgement alone cannot distinguish real from fake.

advanced 22 min read

Mythos and the Vulnerability Classes AI Finds First: Eliminating Your Highest-Risk Attack Surface

Frontier AI models like Anthropic's Mythos find vulnerability classes that traditional scanners miss: logic flaws, implicit trust, hardcoded secrets, configuration drift. The defensive response is not faster patching. It is eliminating these classes before they are discovered.

advanced 16 min read

Training Data Extraction Prevention: Stopping Models from Leaking Memorised Data

Large language models memorise portions of their training data. Given the right prompt, a model will reproduce training examples verbatim, including...

advanced 16 min read

Model Extraction Prevention: Detecting and Blocking Model Stealing Through API Queries

Model extraction (model stealing) is an attack where an adversary queries a production ML API systematically to reconstruct a functionally equivalent...

advanced 20 min read

Securing AI Agents in Production: Tool-Use Boundaries, Credential Scoping, and Output Verification

AI agents are being deployed with production tool access: shell execution, kubectl, terraform apply, database queries, API calls.

advanced 19 min read

Building an AI Governance Pipeline: Automated Checks from Training to Production

AI governance in most organisations is a manual process. A model is trained, someone writes a document, a committee meets, approvals are collected...

advanced 16 min read

AI Supply Chain Attack Surface: Models, Datasets, and Inference Dependencies

AI systems introduce a supply chain attack surface that traditional software security does not cover. The three new vectors are...

advanced 18 min read

EU AI Act Compliance for Infrastructure Teams: Risk Classification, Documentation, and Technical Controls

The EU AI Act entered into force in August 2024, with enforcement timelines staggered through 2027.

advanced 19 min read

MCP Tool Permission Patterns: Least Privilege, Approval Workflows, and Scope Boundaries

MCP servers expose tools that agents invoke. Without fine-grained permissions, every connected agent can call every tool. This article covers least privilege patterns, per-client allowlists, human approval gates, audit logging, multi-tenant isolation, and capability tokens.

advanced 22 min read

Claude for Application Security: Finding Logic Vulnerabilities in Source Code

Static application security testing (SAST) tools find pattern-based vulnerabilities effectively. Semgrep matches code against rules.

advanced 18 min read

Auditing AI Actions at Scale: Building Tamper-Proof Logs for Non-Human Actors

AI agents operate at machine speed, generating 10-100x the audit data of human operators.

advanced 18 min read

MCP Transport Security: Securing stdio, SSE, and HTTP Channels for Model Context Protocol

MCP supports three transport types: stdio, SSE, and HTTP. Each has distinct security characteristics. This article covers transport-level hardening for all three, including process isolation, TLS, mTLS, CORS, reverse proxy configuration, and rate limiting.

advanced 22 min read

Claude for Kubernetes Security Auditing: Finding Privilege Escalation Paths Scanners Cannot See

Kubernetes security scanners evaluate resources individually. Tools like kube-bench check node configurations against CIS benchmarks.

advanced 16 min read

LLM Jailbreak Defence: Detecting and Preventing System Prompt Bypasses in Production

LLM jailbreaks are inputs that cause a model to ignore its system prompt, safety training, or usage policies.

advanced 18 min read

Verifying AI Agent Output: Deterministic Checks, Human-in-the-Loop Gates, and Rollback Safety

AI agents generate infrastructure configurations, database migrations, deployment manifests, and shell commands. Their output passes a casual review.

advanced 18 min read

Securing MCP Servers: Authentication, Tool Sandboxing, and Input Validation for Model Context Protocol

The Model Context Protocol (MCP) gives AI agents structured access to tools: filesystem operations, database queries, API calls, shell commands.

intermediate 20 min read

Claude for Infrastructure-as-Code Security Review: Terraform, CloudFormation, and Pulumi

Infrastructure-as-Code scanners like Checkov, tflint, and cfn-lint enforce policy through pattern matching.

advanced 19 min read

LLM Prompt Security Patterns: System Prompt Protection, Input Sanitisation, and Context Isolation

LLM applications are vulnerable to prompt injection, system prompt leakage, and cross-user context contamination. This article covers system prompt hardening, input sanitisation, output filtering, and context isolation for multi-tenant deployments.

advanced 19 min read

Algorithmic Auditing: Testing AI Systems for Bias, Fairness, and Safety Before Deployment

AI systems make decisions that affect people: who gets approved for a loan, whose resume gets shortlisted, which content gets flagged, whose...

intermediate 18 min read

Claude, Mythos, and the Non-Human Infrastructure Consumer: Writing Hardening Guides for AI Agents

AI models are no longer just tools that engineers use to write code. They are becoming direct infrastructure consumers:

advanced 18 min read

Detecting AI-Generated Attacks: Moving from Signatures to Behavioural Baselines

Signature-based detection (WAF CRS rules, static Falco rules, antivirus signatures) matches "known bad." AI-generated attacks are polymorphic: every...

advanced 16 min read

Adversarial Attacks on Embeddings: Poisoning Vector Stores and Manipulating Semantic Search

Embedding-based retrieval powers RAG pipelines, semantic search, recommendation systems, and classification.

advanced 16 min read

AI-Powered Vulnerability Discovery: What Automated Code Analysis Means for Your Patch Cycle

AI models can now discover exploitable vulnerabilities in source code faster than human researchers.

advanced 18 min read

Agent-to-Agent Trust: Authentication, Delegation, and Capability Boundaries in Multi-Agent Systems

Multi-agent systems are moving from research demos to production deployments. A coordinator agent delegates tasks to specialist agents: one handles...

advanced 20 min read

Securing LLM Deployments: Model Loading, Runtime Isolation, and Inference Infrastructure

Deploying LLMs in production introduces infrastructure security challenges: model integrity verification, GPU isolation, runtime sandboxing, API authentication, and safe model updates. This article covers the full inference deployment security stack.

advanced 20 min read

The Threat Model Has Changed: Rewriting Security Assumptions for an AI-Augmented World

Every security architecture is built on assumptions about what attackers can do, how fast they can do it, and at what scale.

intermediate 16 min read

AI Model Cards in Production: Documenting Capabilities, Limitations, and Security Properties

Every production AI model has boundaries: input domains where it performs well, edge cases where it fails, and security properties that constrain how...

advanced 16 min read

Hardening the AI Control Plane: Kill Switches, Rate Limits, and Human-in-the-Loop Gates

AI agents with write access to production systems can execute 100+ infrastructure changes per minute.

advanced 20 min read

How AI Is Compressing the Attacker Timeline: What Defenders Need to Change Now

The gap between vulnerability disclosure and weaponised exploit used to be measured in weeks.

advanced 16 min read

Membership Inference Defence: Preventing Attackers from Determining Training Data Inclusion

Membership inference attacks determine whether a specific data record was used to train a model.

advanced 18 min read

Sandboxing AI Agent Tool Use: Filesystem, Network, and Process Isolation for Autonomous Actions

AI agents execute tool calls on real infrastructure: writing files, running shell commands, making HTTP requests, modifying databases.

intermediate 18 min read

Claude for Security Detection: How Large Language Models Find What Scanners Miss

Traditional security scanners operate on pattern matching. They check for known CVEs in dependency trees, match regex patterns for hardcoded secrets,...

intermediate 14 min read

Using AI to Harden Systems: Automated Configuration Review and Remediation

Manual security review of infrastructure-as-code takes 2-4 hours per pull request for complex changes.

advanced 18 min read

AI Credential Delegation: Short-Lived Tokens, Scope Narrowing, and Audit Trails for Agent Access

AI agents need credentials to do useful work: database passwords, API keys, Kubernetes service account tokens, cloud IAM roles.

advanced 18 min read

AI Incident Reporting: Detection, Classification, and Response Procedures for AI System Failures

Traditional incident response assumes failures are binary: the service is up or it is down, the response is correct or it throws an error.

intermediate 20 min read

Claude for Security Incident Triage: Rapid Analysis of Logs, Alerts, and Blast Radius

When a security alert fires at 2 AM, the on-call engineer faces an information overload problem.

Observability & Detection

intermediate 13 min read

Alert Deduplication and Correlation Patterns: Beating Alert Fatigue at Scale

Per-rule grouping and fingerprint-based dedup get you from 10,000 alerts/day to 200. Correlation across signals is the next jump — to 30 actionable incidents.

intermediate 14 min read

Forensic Readiness: Log Retention, Capture, and Chain of Custody for Incident Response

What you don't capture, you can't investigate. Forensic readiness is the discipline of designing the logging layer so post-incident you have what you need.

intermediate 14 min read

Security SLOs and Error Budgets: SRE Discipline Applied to Detection and Response

Treat security as a service: define SLIs (detection coverage, MTTD), set SLOs, track burn rate. The same discipline that makes reliability measurable makes security measurable.

intermediate 14 min read

Detection Engineering Metrics: MTTD, MTTR, Signal-to-Noise, and Coverage Tracking

If you cannot measure your detection program, you cannot improve it. The metrics that matter, how to compute them, and what they trigger when they shift.

intermediate 14 min read

OpenTelemetry PII Leakage: Stopping Sensitive Data in Span Attributes, Baggage, and Logs

OTel traces capture authorization headers, URL params, internal IDs, and database query strings by default. Without redaction, your traces are an exfiltration target.

intermediate 14 min read

SIEM Cost Optimization: Cardinality, Retention, Sampling, and Index-Tier Strategy

SIEM bills double yearly because nobody owns the spend. Cardinality control, retention tiering, and sampling reduce cost 40-70% without losing detection.

intermediate 15 min read

Detection-as-Code with Sigma: Versioned, Tested, Vendor-Neutral SIEM Rules

Detection logic scattered across SIEM consoles and shell scripts does not scale. Sigma rules in Git, tested in CI, converted to any backend on deploy, do.

intermediate 18 min read

Securing the OpenTelemetry Collector: Deployment Patterns, TLS, and Access Control

The OpenTelemetry Collector processes every trace, metric, and log in your infrastructure. A compromised Collector leaks all observability data.

intermediate 14 min read

Security Dashboards That Engineers Actually Use: Grafana Designs for Hardening Verification

Most security dashboards are vanity metrics: total alerts this month, pie charts of vulnerability severity, traffic heatmaps that look impressive but...

advanced 16 min read

OpenTelemetry for Security: Distributed Tracing of Authentication and Authorization Flows

Distributed tracing is standard for performance debugging, but almost no team uses it for security.

intermediate 18 min read

OpenTelemetry Collector Pipelines: Securing Receivers, Processors, and Exporters

An OTel Collector pipeline with default settings forwards every attribute, header, and trace to your backend with no filtering or authentication.

advanced 18 min read

Lateral Movement Detection: Network Patterns, Authentication Anomalies, and Alert Correlation

East-west traffic inside a Kubernetes cluster is a blind spot for most security teams.

intermediate 18 min read

Security-Relevant Prometheus Metrics: What to Collect, How to Alert, When to Page

Prometheus is deployed in most Kubernetes environments for infrastructure monitoring (CPU, memory, disk, request latency)...

advanced 18 min read

eBPF-Based Security Monitoring: Tetragon for Process, Network, and File Observability

Falco monitors syscalls for runtime detection. Tetragon (CNCF/Cilium) goes deeper: it monitors process execution, network connections, and file...

advanced 16 min read

Log Integrity and Tamper Detection: Ensuring Your Audit Trail Is Trustworthy

An attacker's first post-compromise action is covering their tracks. On a Linux host, this means deleting /var/log/audit/audit.log, clearing journal...

advanced 18 min read

Container Escape Detection: Runtime Signals, Kernel Indicators, and Response Automation

Container escapes are the highest-impact attack in Kubernetes. A single compromised pod that escapes its container gains access to the underlying...

advanced 16 min read

Kubernetes Audit Log Pipeline Design: From API Server to SIEM

Kubernetes audit logging at the RequestResponse level captures everything: every API call, every request body, every response payload.

intermediate 15 min read

Crypto Mining Detection: CPU Patterns, Network Signatures, and Automated Response

Cryptojacking is the most common post-compromise activity in Kubernetes environments.

advanced 18 min read

Building Detection Rules That Don't Cry Wolf: Alert Design for Security Events

Security detection that generates 50+ false positives per day is worse than no detection: it trains the team to ignore alerts.

intermediate 15 min read

Certificate Expiry Monitoring: Automated Detection Across TLS, mTLS, and Signing Certificates

Certificate expiry is the most common cause of preventable production outages. When a TLS certificate expires, HTTPS connections fail, mTLS...

intermediate 17 min read

Incident Response Runbooks: Structured Procedures for Common Security Events

Detection without documented response is security theatre. Most teams have alerts that fire at 3 AM, but no written procedure for what the on-call...

intermediate 20 min read

Centralized Logging Architecture for Security: Fluentd, Vector, and Loki Compared

Self-managed log infrastructure is one of the highest operational costs for small-to-medium teams.

advanced 22 min read

Building a Security Audit Log Pipeline That Scales: auditd to Elasticsearch

Linux audit logs are the ground truth for security investigation. auditd captures kernel-level events that no userspace tool can see: file access by...

WebAssembly

advanced 13 min read

WASM Cold-Start Optimization for Security Workloads: Pre-Compilation, Snapshots, and AOT

Security-side WASM (auth filters, policy engines, MCP plugins) must be sub-millisecond to deploy at request rate. Pre-compilation and snapshotting get you there.

advanced 14 min read

WASM in IoT and Embedded Production: WasmEdge, wasm3, WAMR, and OTA Update Security

WASM lets you ship logic to constrained devices without firmware updates. The runtime, the trust model, and the OTA pipeline all need careful design.

advanced 14 min read

WASM Plugin Architecture Threat Modeling: Trust Boundaries, Host-API Exposure, and Supply Chain

Plugin systems built on WASM have a recurring shape. Threat-modeling that shape catches the structural mistakes before deployment.

intermediate 14 min read

Edge Runtime WASM Hardening: Cloudflare Workers, Fastly Compute, and Multi-Tenant Isolation

Edge runtimes execute untrusted customer code in shared processes. The hardening contract is the platform's, but the customer code's behavior decides the blast radius.

intermediate 14 min read

Envoy and Istio WASM Plugin Hardening: Resource Limits, ABI Selection, and Distribution

WASM plugins run inline in the data path. A misconfigured plugin can exhaust memory, leak tenant data, or crash the proxy. The defaults need explicit caps.

intermediate 15 min read

NGINX WASM Filters with ngx_wasm_module: Request-Path Plugins, Resource Caps, and Distribution

ngx_wasm_module brings the proxy-wasm protocol to NGINX. Plugin authoring is similar to Envoy, but the worker model and hardening surface differ.

intermediate 13 min read

Reproducible WASM Builds and SBOM Generation: Deterministic Compilation, CycloneDX, In-Toto Attestations

WASM is the easy case for reproducibility — no dynamic linking, no runtime variance. Most teams still ship non-reproducible builds. The fix is small.

intermediate 14 min read

WASI HTTP Server Hardening: Production Patterns for wasi:http/incoming-handler

WASI HTTP servers are a clean platform-neutral pattern. The hardening is at the application layer — body limits, header allowlists, response shaping, and panic semantics.

advanced 16 min read

WASI Preview 2 Capability-Based Security: filesystem, sockets, http, and the Component Model

Preview 2 replaces Preview 1's coarse imports with explicit, scoped, capability-passing interfaces. The security story is the actual reason to migrate.

advanced 14 min read

WASI Sockets API Hardening: TCP, UDP, and TLS Capability Scoping for Network-Bound WASM

wasi:sockets/tcp and wasi:sockets/udp give WASM modules network access. The capability model is fine-grained — most embedders use it as a coarse on/off switch.

advanced 14 min read

WASM AI Inference: Isolating ONNX Runtime Web, llama.cpp WASM, and On-Device Models

Running AI inference inside WASM is a new deployment pattern with novel isolation properties. The threat model differs from GPU-served inference.

advanced 14 min read

WASM Component Model Security Boundaries: Composition, Capability Passing, and Trust Decisions

When you compose multiple components, every wire is a capability decision. The security story of a composed application lives in the WIT between components.

advanced 14 min read

WASM in Databases: pg_wasm, ClickHouse UDFs, SurrealDB Extensions

Databases are growing WASM extension points. The threat model spans both WASM-runtime escape and database-internal lateral access — different from container UDFs.

advanced 15 min read

WASM Multi-Tenancy Patterns: Resource Quotas, Fair Scheduling, and Tenant Isolation Failures

Running many tenants' WASM modules in one runtime is the hard case. Per-tenant fairness, isolation guarantees, and the failure modes that violate both.

intermediate 14 min read

OCI WASM Module Signing and Verification: cosign, notation, and Admission-Time Enforcement

WASM modules ride OCI registries the same as containers. The supply-chain hygiene story is the same — and most orgs do not apply it to .wasm artifacts.

advanced 16 min read

WASM Workloads on Kubernetes: runwasi, Spin, and the Threat Model Shift from OCI Containers

WASM on Kubernetes via runwasi and containerd shims runs alongside containers but with a different escape surface, different RBAC implications, and different supply-chain controls.

intermediate 14 min read

WASM Module Static Analysis and Vulnerability Scanning: wasm-tools, twiggy, and CVE Detection

Scanning .wasm artifacts is different from scanning containers — no rootfs, no package manager. The dependency graph is in the bytecode.

advanced 16 min read

Wasmtime Production Hardening: Fuel, Memory, Epoch Interrupts, and WASI Capability Allowlists

Wasmtime's defaults are friendly, not safe. Untrusted modules need explicit limits on CPU, memory, syscall surface, and filesystem access.

advanced 14 min read

Wazero Hardening for Go Embedders: Resource Limits, WASI Capabilities, and Plugin Isolation

Wazero is the pure-Go WASM runtime used by Tetragon, Cilium, k6, Trivy, and dapr. The defaults are friendly; production deployments need explicit caps.