Kubernetes / Platform Security Articles
Kubernetes hardening guides covering RBAC, network policies, admission control, secrets management, runtime security, and AI workloads.
Kubernetes Security and Hardening Guides
CSI Driver Security: Volume-Mount Hardening, Privileged Drivers, and Inline Ephemeral Volumes
CSI drivers run with broad privileges by design. Their security posture often goes unaudited — until one is the exfil path or the privilege-escalation step.
External Secrets Operator: Pulling Secrets from KMS, Vault, and Cloud Stores into Kubernetes
Native Kubernetes Secrets are visible to anyone with get access on Secrets in the namespace. External Secrets Operator pulls from your real secret store on a schedule, with rotation and audit.
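A minimal sketch of the sync resource the article covers — all names (the store, namespace, and Vault path) are hypothetical placeholders:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: payments                  # hypothetical namespace
spec:
  refreshInterval: 1h                  # re-sync hourly, picking up rotations
  secretStoreRef:
    name: vault-backend                # hypothetical SecretStore pointing at Vault
    kind: ClusterSecretStore
  target:
    name: db-credentials               # the Kubernetes Secret ESO creates and updates
  data:
    - secretKey: password
      remoteRef:
        key: secret/data/payments/db   # hypothetical Vault path
        property: password
```

The operator, not the developer, holds the credentials for the backing store; workloads consume an ordinary Secret that is continuously reconciled against the source of truth.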
Native Sidecar Containers in Kubernetes 1.29+: Lifecycle, Security, and Mesh Migration
Init containers with restartPolicy: Always, which reached beta in 1.29, fix the long-standing init/main startup race. The bigger security wins come in service-mesh and log-shipper deployments.
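The whole feature is one field on an init container — a minimal sketch with hypothetical image names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-shipper
spec:
  initContainers:
    - name: log-shipper
      image: fluent-bit:2.2      # hypothetical sidecar image
      restartPolicy: Always      # marks this init container as a native sidecar
  containers:
    - name: app
      image: my-app:1.0          # hypothetical application image
```

The sidecar is guaranteed to be running before the main container starts, is restarted if it exits, and is terminated after the main containers — which also fixes the classic Job-never-completes problem with mesh proxies.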
Confidential Containers on Kubernetes: AMD SEV-SNP, Intel TDX, and the Attestation Flow
Confidential Containers move workload isolation from the kernel to the silicon. Encrypted memory, hardware-attested boot, and a different threat model than user namespaces.
User Namespaces for Pods: UID Remapping, Container Escape Defense, and the GA Path in Kubernetes 1.30+
hostUsers: false runs the Pod in a fresh user namespace, remapping its UIDs into a per-Pod range. A container running as root sees uid 0 inside; the host sees an unprivileged user. Big hardening win, easy to enable.
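In the Pod spec this is a single field, hostUsers: false — a minimal sketch:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false      # give the pod its own user namespace
  containers:
    - name: app
      image: nginx      # uid 0 inside maps to an unprivileged host uid
```

A container escape that lands on the host then lands as a high, unprivileged UID rather than real root.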
ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks
VAP replaces webhook admission for the policies you write most often. No Kyverno, no OPA, no network round-trip, no webhook availability risk.
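A sketch of the kind of policy the article means — the policy name and CEL expression are illustrative, and a ValidatingAdmissionPolicyBinding is still needed to put it into effect:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-runasnonroot
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    # has() guards keep the expression from erroring when fields are unset
    - expression: >-
        has(object.spec.template.spec.securityContext) &&
        has(object.spec.template.spec.securityContext.runAsNonRoot) &&
        object.spec.template.spec.securityContext.runAsNonRoot == true
      message: "Deployments must set runAsNonRoot: true"
```

The expression is evaluated in-process by the API server, so there is no webhook endpoint to keep available or to fail open around.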
Gateway API Security Patterns: Multi-Team Routing, ReferenceGrant, and Delegated Trust on Kubernetes
Gateway API replaces Ingress with a multi-role model that separates infrastructure, cluster operator, and application developer concerns. New surface, new threat model.
LLMs on Kubernetes: Understanding the Threat Model and Deploying an LLM Gateway
Kubernetes orchestrates LLM workloads but has no awareness of what those workloads do. An Ollama pod with healthy readiness probes and stable resource usage can still leak secrets, execute prompt injection, and grant models excessive agency over internal services. This article covers the LLM-specific threat model for Kubernetes and implements an LLM gateway as the policy enforcement layer.
Kubernetes Node Hardening: From OS Configuration to kubelet Lockdown
A Kubernetes node is a Linux machine running kubelet, a container runtime, and your workloads.
GPU Workload Isolation: MIG, MPS, and vGPU Security Boundaries
Multi-tenant GPU sharing without isolation risks data leakage between workloads through shared GPU memory.
GPU Cost and Security Monitoring: Detecting Abuse and Optimising Spend
GPU compute costs between $2 and $30 per hour per device. A single unauthorised cryptocurrency mining pod running on an A100 for a weekend generates..
LLM Rate Limiting in Production: Token Budgets, Per-User Quotas, and Abuse Detection
Request-count rate limiting fails for LLM workloads because a single request can consume 100K tokens. Token-based rate limiting with per-user quotas and abuse detection prevents runaway costs and catches prompt injection probing before it escalates.
Runtime Security with Falco on Kubernetes: Rules, Tuning, and Response Automation
Prevention-only security has a binary failure mode: either the control holds and the attacker is stopped, or the control fails and the attacker...
Kubernetes Network Policies That Actually Work: From Default Deny to Microsegmentation
By default, every pod in a Kubernetes cluster can communicate with every other pod across all namespaces. There are no network boundaries.
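The starting point the article builds from is a default-deny policy per namespace — a minimal sketch (the namespace name is hypothetical, and a CNI that enforces NetworkPolicy is assumed):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments          # hypothetical namespace
spec:
  podSelector: {}              # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress                   # no rules listed, so nothing is allowed
```

With this in place, every permitted flow must be added back explicitly, which is what makes the later microsegmentation policies auditable.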
LLM Cost Controls: Budget Enforcement, Token Metering, and Spend Alerting
Without enforced budgets, a single team can exhaust an organization's entire AI spend in days. Token metering with per-team budgets, automatic request rejection at limits, model routing by cost, and chargeback dashboards turn LLM spending from a surprise into a managed line item.
Kubelet Security Configuration: Authentication, Authorization, and Read-Only Port
The kubelet runs on every node in the cluster with root-level access to the container runtime, all pod specifications, mounted secrets, and the host..
Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers
RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months.
Kubernetes Secrets Management: External Secrets Operator, Vault, and Sealed Secrets
Kubernetes Secrets are base64-encoded, not encrypted. Anyone with RBAC read access to secrets in a namespace can decode every credential stored there.
AI Incident Forensics: Reconstructing What an AI System Did, Why, and What Data It Accessed
When a traditional application causes an incident, you examine logs, traces, and database queries to reconstruct what happened.
Hardening Model Inference Endpoints: Authentication, Rate Limiting, and Input Validation
Model inference endpoints are GPU-backed and expensive: $2-30 per hour per GPU. A single unprotected endpoint exposed to the internet can accumulate..
Kubernetes Admission Control: From PodSecurity Standards to Custom OPA/Kyverno Policies
Without admission control, any user with deployment permissions can run privileged containers, mount the host filesystem, use the host network, run...
AI Data Leakage Prevention: Input Filtering, Output Scanning, and Audit Trails
AI systems leak data in ways traditional applications do not. A language model trained on customer data can reproduce verbatim customer records in...
Jupyter Notebook Security: Authentication, Isolation, and Data Protection
JupyterHub is a code execution platform. Every notebook cell is arbitrary code running with whatever permissions the notebook server process has.
Multi-Tenancy Hardening in Kubernetes: Namespace Isolation, Resource Quotas, and Network Boundaries
Kubernetes namespaces provide logical separation, not security isolation. By default, pods in namespace A can send network traffic to pods in...
Building a Content Filtering Pipeline for LLM Applications: From Raw Input to Safe Output
A single content filter is not a pipeline. Most LLM deployments add one filter (usually on output) and call it done.
AI Red Teaming Methodology: Structured Adversarial Testing for LLM Applications
Traditional security testing (penetration testing, vulnerability scanning) does not cover AI-specific attack surfaces.
Kubernetes Image Policy Enforcement: Cosign, Notation, and Admission Webhooks
Without image policy enforcement, any container image from any registry can run in a Kubernetes cluster.
Securing RAG Pipelines: Vector Database Access Control, Document Poisoning, and Retrieval Filtering
Retrieval-Augmented Generation (RAG) adds a knowledge base to LLM applications: the model retrieves relevant documents before generating a response.
Pod Security Context Deep Dive: runAsNonRoot, readOnlyRootFilesystem, and Capabilities
Kubernetes SecurityContext has over 15 configurable fields, but most teams only set runAsNonRoot: true and consider the job done.
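A sketch of the baseline the article argues for, beyond runAsNonRoot alone — the image name and UID are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:                      # pod-level defaults
    runAsNonRoot: true
    runAsUser: 10001                    # hypothetical non-root UID
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: my-app:1.0                 # hypothetical image
      securityContext:                  # container-level hardening
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```

Each field closes a distinct escalation path: no setuid escalation, no writable root filesystem to drop tooling into, no ambient capabilities, and a syscall filter by default.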
Vector Database Security: Access Control, Embedding Protection, and Query Isolation
Vector databases are the backbone of RAG (Retrieval-Augmented Generation) systems.
A/B Model Deployment Safety: Canary Rollouts, Traffic Splitting, and Automated Rollback for ML Models
Deploying a new ML model version is not the same as deploying a new application version.
Kubernetes API Server Hardening: Flags, Authentication, and Audit Logging
The API server is the front door to the Kubernetes cluster. Every kubectl command, every controller reconciliation, every pod scheduling decision,...
Seccomp Profiles for Production Workloads: Writing, Testing, and Deploying Custom Profiles
With no seccomp profile applied, a container can invoke roughly 300 syscalls. A compromised container can use unshare to create new namespaces, clone to spawn...
etcd Encryption at Rest: Configuration, Key Rotation, and Performance Impact
Kubernetes Secrets are stored in etcd as base64-encoded plaintext. Base64 is an encoding, not encryption.
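The fix is an EncryptionConfiguration passed to the API server via --encryption-provider-config — a minimal sketch with a placeholder key:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources: ["secrets"]
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>   # placeholder, generate your own
      - identity: {}    # fallback so pre-existing plaintext entries still read
```

Provider order matters: the first provider encrypts new writes, and existing Secrets stay plaintext until rewritten (for example with `kubectl get secrets -A -o json | kubectl replace -f -`).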
Implementing AI Guardrails: Input Validation, Output Filtering, and Safety Classifiers in Production
Deploying an LLM without guardrails is deploying an application where any user can make it say or do anything.
Hardening Kubernetes Ingress Controllers: NGINX, Traefik, and Envoy Compared
The ingress controller is the internet-facing entry point to a Kubernetes cluster.
LLM Observability in Production: Monitoring Latency, Token Usage, Safety Violations, and Drift
Traditional application monitoring (CPU, memory, HTTP status codes, latency) tells you nothing about what an LLM is doing.
Hardening Model Serving Frameworks: TorchServe, Triton, and vLLM Security Configuration
Model serving frameworks ship with defaults optimised for development: management APIs exposed on all interfaces without authentication, model files..
Securing Fine-Tuning Pipelines: Data Isolation, Checkpoint Integrity, and Access Control
Fine-tuning pipelines are high-value targets. They consume expensive GPU hours, process proprietary training data, and produce model checkpoints that...
Hardening the Kubernetes Scheduler: Topology Constraints and Security-Aware Placement
The Kubernetes scheduler places pods on nodes based on resource availability and basic constraints.
Kubernetes Audit Log Analysis: What to Log, How to Query, and What to Alert On
Kubernetes audit logs record every request to the API server: who made the request, what they asked for, and whether it succeeded.
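Audit policy rules are evaluated in order and the first match wins — a sketch of a sane starting policy (the specific rule choices are illustrative):

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata                 # who/what/when, but never the bodies,
    resources:                      # so secret values don't land in the log
      - group: ""
        resources: ["secrets", "configmaps"]
  - level: None                     # drop high-volume noise
    users: ["system:kube-proxy"]
  - level: RequestResponse          # full bodies for RBAC changes
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterrolebindings", "rolebindings"]
  - level: Metadata                 # default for everything else
```

Logging Metadata rather than RequestResponse for Secrets is deliberate: the audit log would otherwise become a second copy of every credential.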
Securing Model Artifact Pipelines: From Training to Serving
Model files are opaque binaries ranging from 1GB to over 1TB. You cannot code-review a set of weights.
RLHF Data Protection: Securing Human Feedback Loops, Preference Data, and Reward Models
Reinforcement Learning from Human Feedback (RLHF) pipelines introduce unique security surfaces that standard ML training workflows do not have.
AI API Key Management: Rotation, Scoping, and Abuse Detection
AI services have turned API keys into direct spending controls. A leaked OpenAI or Anthropic key can generate thousands of dollars in charges within...
Prompt Injection Defence in Production: Input Validation, Output Filtering, and Monitoring
Prompt injection is the SQL injection of AI systems: the most common and most damaging attack class against LLM-powered applications.
Network Segmentation for AI Training Infrastructure
AI training clusters frequently share networks with production services. A training job that can reach the production database is one compromised...
Observability for LLM Applications: Token Usage, Latency Anomalies, and Output Classification
LLM-powered applications have unique observability requirements that standard APM tools do not address: token-based cost tracking (not just request...
Model Registry Access Control: Versioning, Signing, and Promotion Gates
Model registries are the bridge between training and production. A model pushed to the production registry gets served to users.
Kubernetes Service Account Token Security: Bound Tokens, Projected Volumes, and OIDC
Every pod in Kubernetes receives a service account token by default. In clusters running older configurations or without explicit hardening, these...
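The hardened shape is a projected, audience-bound, short-lived token in place of the legacy automount — a sketch with hypothetical image and audience names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bound-token-demo
spec:
  automountServiceAccountToken: false   # opt out of the legacy automount
  containers:
    - name: app
      image: my-app:1.0                 # hypothetical image
      volumeMounts:
        - name: api-token
          mountPath: /var/run/secrets/tokens
  volumes:
    - name: api-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600   # short-lived, rotated by the kubelet
              audience: my-internal-api # hypothetical audience; other services reject it
```

A stolen bound token expires within the hour and only validates against the audience it was minted for, which is what makes the OIDC federation patterns later in the article safe.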