Observability & Detection Articles

Security observability guides covering logging, Prometheus metrics, Falco, detection rules, incident response, forensics, and monitoring.

Security Observability and Detection Guides

Intermediate 14 min read

Detecting Anomalous PR Patterns in OSS Projects via GitHub Audit Logs

GitHub's audit log API exposes contributor behaviour at a level of detail that makes it possible to detect the slow-burn account-preparation patterns used in supply chain attacks: unusual review approvals, sudden privilege escalation, bulk PR merges outside working hours. This guide covers building a detection pipeline over the GitHub audit log for OSS project security monitoring.

Advanced 15 min read

Behavioral Detection for Active CVE Exploitation in Production

When patches haven't landed yet, behavioral signals are the last line of defence. eBPF-based syscall fingerprinting and SIEM correlation rules detect active exploitation of known-unpatched CVEs by matching the specific behavioral pattern of each exploit class — not just generic anomalies.

Intermediate 13 min read

Grafana Alloy Security Hardening: Protecting the OTel Collector Distribution

Grafana Alloy (GA 2024) replaces Grafana Agent as Grafana's OpenTelemetry Collector distribution. Its River/Alloy configuration language, remote configuration capability, and broad data-pipeline access create attack surfaces absent from simpler agents. This guide covers authentication hardening, pipeline isolation, and sensitive-data scrubbing.

Intermediate 13 min read

Prometheus Operator RBAC: Cluster-Wide Secret Access via ServiceMonitor

The default Prometheus Operator RBAC grants Prometheus cluster-wide read access to Secrets, so a compromised Prometheus instance or a crafted ServiceMonitor can exfiltrate every secret in the cluster through legitimate monitoring scrape operations. Scope operator permissions to the minimum required.

Intermediate 13 min read

Integrating CISA KEV into Your SIEM for Real-Time Exploitation Alerts

CISA's Known Exploited Vulnerabilities catalog is updated when CVEs are confirmed to be actively exploited; ingesting KEV additions as real-time SIEM events and cross-referencing them against your asset inventory generates immediate escalation for the CVEs that matter most.

Intermediate 13 min read

Detecting NGINX CVE Exploitation via Logs and Runtime Signatures

NGINX CVEs leave patterns in access logs, error logs, and system call behaviour; Suricata network signatures and Falco runtime rules detect active exploitation of mp4 module heap overflows, QUIC module crashes, and ingress annotation injection before the attacker pivots.

Intermediate 13 min read

Defending Prometheus Against High-Cardinality Label Injection and DoS

Attackers with access to metric write endpoints can inject high-cardinality label values to exhaust Prometheus memory and cause OOM kills; enforce cardinality limits, authenticate remote-write endpoints, and alert on metric explosion before it takes down your monitoring stack.

Intermediate 13 min read

Safe AI-Assisted Security Alert Triage and Escalation

LLMs triaging security alert queues can suppress genuine incidents via hallucination or adversarial alert content; build safe triage with hard escalation overrides, adversarial-input guards, confidence thresholds, and mandatory human review for high-severity classifications.

Intermediate 14 min read

Kubernetes Network Flow Security Monitoring with Cilium Hubble and Retina

eBPF-based network flow visibility tools — Cilium Hubble, Microsoft Retina, and custom XDP programs — expose Kubernetes lateral movement, data exfiltration, and policy bypass in real time; configure flow-level alerting and long-term retention for threat hunting.

Intermediate 14 min read

AI-Assisted Threat Hunting: LLMs in the Security Operations Workflow

LLMs accelerate analyst investigation by translating natural-language hypotheses into detection queries, summarising alert context, and surfacing lateral movement patterns across high-volume log data; integrate them safely without introducing hallucination-driven false negatives.

Intermediate 14 min read

Detecting and Preventing Cloud Audit Log Tampering

Attackers with compromised IAM credentials routinely disable CloudTrail, delete log groups, or modify log export destinations before conducting lateral movement; implement immutable WORM log archival, cross-account monitoring, and real-time tampering alerts.

Intermediate 11 min read

Detecting Developer Credential Harvesting: Monitoring .npmrc, .pypirc, and Cloud Config Files

PamDOORa and Quasar Linux RAT — post-exploitation toolkits active in May 2026 — harvest credentials from developer configuration files: .npmrc (npm tokens), .pypirc (PyPI passwords), .git-credentials (Git tokens), ~/.aws/credentials, ~/.config/gcloud, and ~/.kube/config. This article covers eBPF-based monitoring of these file access patterns with Tetragon and Falco, alerting on anomalous reads, and hardening developer environments against credential harvesting.

Advanced 15 min read

Detecting and Containing eBPF-Based Rootkits That Blind Your Observability Stack

eBPF rootkits can hook kernel functions to hide processes, filter telemetry before it reaches Falco or Tetragon, and evade EDR; detect them via BPF map inspection, kernel integrity cross-checks, and observability-layer redundancy.

Advanced 13 min read

API Threat Detection via Traffic Analysis: Detecting BOLA, Enumeration, and Mass Assignment in Access Logs

BOLA attacks look like normal authenticated requests — the only signal is that one user is accessing many different object IDs in sequence. Enumeration attacks look like elevated 404 rates from a single source. Mass assignment looks like a PATCH request with unexpected fields. Structured access logs with object ID tracking, status code distributions, and request body field analysis reveal all three without application-level instrumentation.

Intermediate 11 min read

Container Patch Compliance Observability: Tracking CVE-to-Patch SLAs Across a Fleet

Knowing that Copa patched an image once is not the same as knowing every production container is currently below the critical CVE threshold. Patch compliance observability requires continuous tracking of image vulnerability age, patch run outcomes, SLA breach detection, and Grafana dashboards that give security teams a real-time view of fleet exposure. This article covers the metrics, exporters, and alerting architecture for container patch compliance at scale.

Intermediate 12 min read

ContainerSSH Audit Logging: Session Recording, S3 Export, and SIEM Integration

ContainerSSH records every SSH session as a structured audit log — keystrokes, commands, and output — and can export session recordings to S3 in asciicast format for forensic replay. This article covers ContainerSSH's audit logging pipeline, shipping session recordings to a SIEM, writing detection rules for anomalous session behaviour, and using session recordings for incident response.

Advanced 13 min read

Detecting Copy-on-Write Exploitation with eBPF: Tracing Dirty Pipe and Overlayfs Attack Patterns

Copy-on-write exploits — dirty pipe, dirty COW, overlayfs copy-up races — share a common behavioural signature: a process writes to a page-cache page it should only be able to read, or gains file capabilities it should not have. eBPF tracing programs can detect these patterns at the syscall and VFS layer before privilege escalation completes. This article covers Tetragon and Falco policies for detecting CoW exploitation attempts in real time.

Advanced 14 min read

Kubernetes Forensics After Compromise: Reconstructing the Attack Timeline

Kubernetes evidence is ephemeral by design — pods are deleted, logs are overwritten, containers are rebuilt. A forensic investigation needs to know: what survives pod deletion, where the Kubernetes API server audit log is stored, what etcd snapshots contain, and how to reconstruct the timeline of an attack from node filesystem artifacts, API server events, and container runtime logs.

Advanced 14 min read

OpenTelemetry Collector Hardening: Pipeline Injection, RBAC, and Securing the Observability Data Path

The OTel Collector receives telemetry from every service in the cluster — an attacker who controls the collector controls all observability data. Log injection via crafted spans, metric manipulation to hide malicious activity, and configuration injection via the pprof/health endpoints are real attack vectors. This article hardens the collector's receivers, processors, exporters, and management endpoints.

Advanced 13 min read

Detecting Secret Access Anomalies: Vault and AWS Secrets Manager Audit Log Analysis

Vault and AWS Secrets Manager both produce structured audit logs. Normal secret access follows predictable patterns: specific applications read specific secrets at predictable intervals. Anomalies (bulk reads, access from unexpected IPs, secrets read without a subsequent application restart, rotation events without matching deployment events) reveal compromise or misconfiguration before credentials are used externally.

Advanced 14 min read

Detecting LLM-Driven Bots Through Observability: Signals That Survive AI Mimicry

Standard bot detection — mouse movement, typing cadence, session replay heuristics — fails against LLM-driven agents that generate statistically humanlike behaviour. Seven detection signals derived from server-side observability survive AI mimicry: API call graph topology, resource fetch completeness, semantic request coherence, timing variance under load, DNS pre-resolution patterns, WebSocket heartbeat regularity, and server-push utilisation.

Advanced 13 min read

AI-Fabricated Log Evidence: Defending Forensic Pipelines Against LLM-Generated Log Forgery

LLMs can generate statistically plausible log entries that match the style, timing, and content of a real application's log stream. An attacker with post-compromise write access to logs can backfill plausible cover-traffic, forge authentication events, or erase evidence by substituting fabricated entries. SIEM pipelines that trust log content need cryptographic integrity proofs.

Intermediate 13 min read

AI-Generated Monitoring vs. Open Source Observability Standards: The Ecosystem Argument

An LLM can write a Prometheus exporter, a Fluent Bit parser, or an OpenTelemetry instrumentation library in minutes. The result works today. In 18 months it is unmaintained, incompatible with current Prometheus scraping changes, not integrated with the OpenTelemetry semantic conventions update, and has no vendor interoperability. The value of open source observability is the ecosystem contract, not the code.

Advanced 14 min read

eBPF Verifier Bugs: Privilege Escalation from Container Observability Tools

CVE-2021-3490 (ALU32 bounds bypass) and CVE-2022-23222 (pointer arithmetic escape) both allowed unprivileged eBPF programs to achieve kernel write primitives. Observability tools like Falco, Tetragon, and Pixie that load eBPF programs into the kernel expand the attack surface — a compromised tool or malicious pod with BPF privileges can escalate to host root.

Intermediate 13 min read

Frontend RUM Security: Grafana Faro, Session Replay, and Browser Telemetry

Hardening browser-side RUM and session-replay pipelines: PII scrubbing, supply-chain integrity, sampling controls, and detection for hostile telemetry.

Advanced 13 min read

Detecting Harvest-Now-Decrypt-Later: Monitoring for Quantum-Era Adversary Collection

Nation-state adversaries are actively recording encrypted traffic today for future quantum decryption. HNDL attacks are detectable through anomalous network tap placement, bulk TLS session recording patterns, and unusual data volume exfiltration. This guide covers HNDL threat indicators, network monitoring for bulk collection behaviour, and using PQC adoption as a detection tripwire.

Intermediate 12 min read

Auditing MCP Tool Calls: Building the Forensic Trail for Agent Actions

When an AI agent reads a sensitive file, executes a database query, or calls an external API via MCP, that action is invisible to traditional audit systems — it appears as normal process I/O, not as a distinct auditable event. Structured MCP tool call logging, parameter capture, and result hashing give incident responders the trail they need to reconstruct what an agent did and why.

Intermediate 13 min read

Security Issues in Observability Tooling: Reporting Vulnerabilities in Prometheus, Grafana, and Elasticsearch

Observability tools store security-sensitive data — logs containing credentials, metrics revealing system behaviour, traces with PII. Vulnerabilities in Prometheus, Grafana, Elasticsearch, and Loki can expose this data or provide a pivot into the infrastructure they monitor. This guide covers the security disclosure processes for major observability projects, how to report vulnerabilities, and how to respond as a consumer.

Intermediate 14 min read

OpenTelemetry Profiles Signal Security: PII Leakage, Access Control, and Symbolisation Pipelines

OTel Profiles is the fourth signal alongside traces, metrics, and logs; it is stable as of 2025 and now flows through the OTel Collector by default. Stack frames carry function names, file paths, and sometimes full SQL or cleartext URLs. This guide hardens collector pipelines and storage against that exposure.

Advanced 13 min read

perf_event_open and Kernel Profiling as an Attack Surface: CVE-2023-2235 and Hardening Paranoid Mode

The Linux perf_event_open() syscall — used by perf, pprof, py-spy, async-profiler, and Datadog APM — has produced a stream of local privilege escalation CVEs. CVE-2023-2235 (use-after-free in perf_group_detach) required only perf_event_paranoid <= 1 to achieve kernel code execution. The tradeoff between profiling capability and kernel attack surface is controlled by a single sysctl.

Advanced 13 min read

Correlating SAST Findings with Runtime Behaviour: Prioritising Reachable Vulnerabilities

SAST tools report thousands of findings — but most are in code paths that are never executed in production. Correlating static findings with runtime traces, error rates, and WAF telemetry identifies which vulnerabilities are in hot code paths, which are reachable from the internet, and which can be de-prioritised. This guide builds a SAST-to-runtime correlation pipeline using OpenTelemetry, distributed tracing, and SARIF metadata.

Advanced 13 min read

Security Observability for AI Inference Infrastructure: Monitoring Prompt Injection, Model Abuse, and Inference Threats

AI inference endpoints are APIs with unusually high blast-radius inputs: a single prompt can exfiltrate training data, bypass all downstream application logic, or drain budget at scale. This article builds a security observability layer specifically for LLM inference — logging the right signals, detecting prompt injection and jailbreaks, identifying model extraction attempts, and applying OpenTelemetry GenAI semantic conventions without creating a PII logging catastrophe.

Intermediate 11 min read

Alertmanager Receiver Security: SSRF, API Hardening, and Alert Pipeline Integrity

Alertmanager webhook receivers can be weaponised for SSRF if an attacker modifies the configuration. Harden the admin API with authentication, restrict receiver URLs to an allowlist, and protect the alert pipeline from pre-attack blind spot creation.

Intermediate 12 min read

API Traffic Security Observability: Monitoring API Behaviour for Security Threats

API gateways aggregate traffic statistics, but security threats live in per-caller behaviour over time: brute-force patterns across auth failures, scanning behaviour in parameter variation, data dump signatures in response sizes. This article builds a security observability layer on top of API traffic using OpenTelemetry, Prometheus, and Elasticsearch to surface what gateway dashboards hide.

Intermediate 11 min read

Cloud Cost Anomaly Detection as a Security Signal: Crypto Mining and Unauthorized Compute

Cost spikes are often the earliest observable indicator of a cloud compromise. Learn how to configure AWS, GCP, and Azure cost anomaly detection, correlate billing signals with security events, and automate quarantine responses.

Advanced 13 min read

Container Memory Forensics for Incident Response

Some malware lives only in memory, credentials sit decrypted on the heap, and C2 implants leave no files on disk. This guide covers capturing and analysing container process memory without losing evidence, using /proc, gcore, CRIU checkpoints, and Volatility 3.

Advanced 12 min read

Security Considerations for Continuous Profiling with Parca and Pyroscope

Understand the kernel attack surface, privilege model, and data sensitivity risks of eBPF-based continuous profiling with Parca and Grafana Pyroscope, and harden deployments against each threat.

Advanced 13 min read

Detecting Credential Access Attempts: Log Analysis and Runtime Monitoring

Attackers steal credentials before they steal data. This article shows how to instrument auditd, Falco, Kubernetes audit logs, and CloudTrail to detect OS credential dumping, brute force, credential stuffing, and cloud IAM abuse before they lead to a breach.

Advanced 13 min read

Detecting Data Exfiltration Through Log Analysis and Network Monitoring

Attackers who reach your data will use HTTP/S, DNS tunnelling, ICMP, cloud storage, and email to move it out. This article builds a layered detection stack: volumetric alerts on VPC flow logs, covert channel detection via Zeek and Elasticsearch, Falco rules for staging behaviour, cloud DLP integration, and a high-confidence correlation rule that combines internal staging with external transfer.

Intermediate 12 min read

Database Activity Monitoring: Audit Logs, SQL Inspection, and SIEM Integration

Application logs tell you what the API did. Database audit logs tell you what actually happened to the data. Learn how to configure pgaudit, MySQL audit plugins, MongoDB auditing, and Redis monitoring to detect SQL injection, privilege escalation, and exfiltration at the data layer.

Intermediate 12 min read

Datadog Security Configuration Hardening

The Datadog Agent runs with broad system access by default — reading all container logs, hooking the kernel for APM, and transmitting data to Datadog's intake. Hardening covers Agent privilege reduction, API and app key management, RBAC scoping, sensitive data scrubbing, network configuration, and Datadog's own CSPM and audit trail features.

Advanced 12 min read

Detecting AI-Automated Container Escapes with Runtime Monitoring

LLM-driven container escapes show distinct patterns: systematic /proc enumeration, rapid sequential exploit attempts, and methodical attack chain progression. Build Falco rules and eBPF detection tuned for AI attack signatures rather than just human-paced intrusion patterns.

Advanced 13 min read

Falco Runtime Security: Writing Effective Detection Rules and Deploying Falco Securely

Falco is the de facto standard for Linux runtime security monitoring. This guide covers its syscall-based detection model, writing custom rules for privilege escalation, container escapes, and credential access, tuning rules to eliminate false positives, securing falco.yaml, routing alerts through Falcosidekick, and automating response with Falco Talon.

Intermediate 12 min read

File Integrity Monitoring with Falco and AIDE: Detecting Unauthorized File Changes

Deploy a layered file integrity monitoring strategy using AIDE for baseline integrity checks and Falco for real-time detection. Covers AIDE configuration, database initialization, scheduled checks, SIEM integration, Falco fanotify rules for /etc/ and /usr/bin/ writes, combining both tools, Wazuh syscheck as a managed alternative, and handling legitimate change windows.

Intermediate 12 min read

Fluent Bit Security Hardening: Securing Log Collection Pipelines in Kubernetes

Fluent Bit runs as a privileged DaemonSet that reads every pod log on every node. A misconfigured Fluent Bit deployment leaks PII, ships logs to the wrong destination, and provides an exfiltration vector. Harden RBAC, mTLS output, PII scrubbing, and routing controls before attackers reach your log pipeline.

Intermediate 12 min read

Kubernetes Events for Security: Detecting Threats Beyond the Audit Log

Kubernetes events surface OOMKilled pods, image pull failures, CrashLoopBackOff cycles, and node pressure before an attacker's activity reaches audit logs — here's how to collect, ship, and alert on them.

Intermediate 11 min read

Log Retention Policy, Archival Security, and Compliance-Driven Log Management

Regulatory frameworks disagree on how long logs must be kept, but they all agree logs must be tamper-evident and access-controlled. This guide covers tiered retention design, WORM archival with S3 Object Lock, Elasticsearch ILM, GDPR right-to-erasure tensions, and cost-optimised cold storage for PCI DSS, SOC 2, HIPAA, and GDPR compliance.

Intermediate 12 min read

mTLS Observability: Monitoring Certificate Health, Detecting Misconfigurations, and Alerting on TLS Failures

When mTLS is misconfigured, traffic silently falls back to plaintext or fails — with no visible error unless you have the right metrics. This guide covers the key signals to track: handshake failure rates, certificate expiry, plaintext traffic detection, Istio and Linkerd mTLS coverage metrics, and SPIFFE SVID rotation health.

Advanced 14 min read

Real-Time Payment Fraud Detection: Velocity Rules, Device Signals, and Behavioral Baselines

Payment fraud detection requires sub-second decisions combining transaction velocity, device fingerprinting, geolocation consistency, and behavioral baselines. This guide covers building a layered fraud detection system with rule-based velocity checks, ML-based anomaly scoring, and streaming analytics — applicable to card payments, ACH transfers, and Open Banking transactions.

Advanced 12 min read

Process Tree Security Analysis: Detecting Attacks Through Process Lineage

Individual process events look normal in isolation. Process lineage exposes the attack: nginx spawning bash spawning curl is a web shell, not routine activity. This article covers eBPF-based parent tracking, Falco rules, osquery lineage queries, Elasticsearch aggregations, and specific detection patterns for web shells, reverse shells, credential dumping, and container escapes.

Advanced 13 min read

Runtime Application Self-Protection (RASP): In-Process Security Monitoring and Blocking

RASP instruments the application runtime itself — JVM agents, Python function hooks, Go middleware — giving it full execution context to detect and block SQL injection, command injection, and path traversal at the exact point they occur, not at the network perimeter. This article covers how RASP works, open-source and commercial options, implementing lightweight Python and Java RASP, performance trade-offs, and how RASP fits as a defence-in-depth layer alongside input validation and WAFs.

Advanced 13 min read

Advanced Security Event Correlation: EQL Sequences, Entity Graphs, and Automated Response

Single-event SIGMA rules miss multi-stage attacks where every individual event looks benign. EQL sequence detection, graph-based entity correlation, and temporal pattern analysis close this gap — turning scattered low-confidence signals into high-confidence attack-chain alerts.

Intermediate 12 min read

Security SLIs and Error Budgets: Measuring Posture with SRE Discipline

Apply SRE error-budget discipline to security posture: define SLIs for mTLS coverage, vulnerability scan pass rates, secret rotation, patch SLA, and MTTD. Set realistic SLOs, implement multi-window burn-rate alerts in Prometheus, and use budget depletion to trigger security sprints.

Intermediate 12 min read

Serverless Security Observability: AWS Lambda, GCP Cloud Functions, Azure Functions

Serverless and FaaS workloads present unique security observability challenges: no persistent agents, ephemeral execution environments, and platform-managed runtimes with limited introspection. This article covers structured security logging, abuse detection, layer integrity, secret management, VPC controls, and exfiltration detection for AWS Lambda, GCP Cloud Functions, and Azure Functions.

Intermediate 12 min read

Splunk Security Hardening: Authentication, RBAC, TLS, and Audit Logging

Splunk ingests every security log in your environment — compromising it gives an attacker a complete map of your defenses and an erasure tool for the audit trail. This guide covers SAML/LDAP authentication, role-based access control, TLS hardening for forwarder-to-indexer traffic, audit logging, and protecting the splunk.secret file.

Intermediate 11 min read

Synthetic Monitoring as a Security Tool: Blackbox Exporter, Certificate Probes, and Tamper Detection

Prometheus Blackbox Exporter probes external endpoints continuously — making it a powerful early-warning system for TLS certificate expiry, TLS downgrade attacks, content tampering, DNS hijacking, and missing security headers, weeks before users are affected.

Intermediate 12 min read

Securing Distributed Tracing Infrastructure: Grafana Tempo and Jaeger

Distributed traces are a security liability by default — they accumulate request parameters, user IDs, internal service URLs, and raw SQL across every hop of every request. This guide hardens the full tracing stack: PII scrubbing before storage, Tempo authentication and multi-tenancy, S3 backend encryption, Jaeger access control, OTLP endpoint authentication, and the right-to-erasure problem in append-only trace storage.

Advanced 13 min read

Securing Multi-Tenant Prometheus Deployments with Thanos

Single Prometheus instances per cluster give every tenant shared access to every metric with no isolation, no long-term retention controls, and no cross-cluster query security. Thanos solves the scaling problem but introduces its own attack surface: exposed gRPC endpoints, cross-tenant query leakage, object storage misconfigurations, and PII in time-series labels. This guide hardens every Thanos component.

Advanced 13 min read

User Behavior Analytics: Detecting Insider Threats and Compromised Accounts

Signature-based detection misses insider threats and compromised credentials entirely. UBA builds behavioral baselines per user and entity, then surfaces deviations — off-hours access, bulk downloads, impossible travel — as risk scores that trigger investigation before damage is done.

Intermediate 12 min read

VictoriaMetrics Security Hardening: Authentication, TLS, Tenant Isolation, and Data Protection

VictoriaMetrics is a high-performance Prometheus-compatible TSDB with no built-in authentication. Without vmauth, anyone who reaches any component endpoint reads or writes all metrics. This guide hardens every layer: vmauth proxy authentication, per-component TLS, vmgateway JWT tenant isolation, vmagent credential management, deleteRange API access control, and backup encryption.

Intermediate 10 min read

Grafana Datasource Auth Bypass: CVE-2026-27880 and HTTP Path Normalisation

CVE-2026-27880 lets Grafana Viewers bypass datasource access controls with a double slash in the API path. Patch to fixed versions, enforce datasource permissions, and understand the HTTP path normalisation class of auth bypass vulnerabilities.

Advanced 11 min read

OTel Collector Remote Configuration Security: Hardening the OpAMP Trust Boundary

OpAMP lets a central server push arbitrary pipeline configs to OTel Collectors. An attacker with OpAMP server access can redirect all telemetry to their endpoint or disable security alert pipelines. Harden the OpAMP trust boundary with mTLS, config signing, and change alerting.

Intermediate 11 min read

SBOM-Driven Supply Chain Compromise Detection: Finding Axios 1.14.1 in Production

After the Axios compromise, organisations needed to know if 1.14.1 was running in production. SBOMs attached to container images as OCI attestations make this a seconds-long query. Build a continuous SBOM monitor that alerts when IOC packages appear in deployed workloads.

Advanced 12 min read

Grafana Plugin Trust and RCE: The CVE-2026-27876 Attack Chain

CVE-2026-27876 chains a SQL expressions file-write with Grafana's enterprise plugin loader to achieve RCE from Viewer access. Understand the delayed-disclosure pattern and how to harden plugin trust, feature toggles, and filesystem permissions.

Advanced 12 min read

Runtime Detection of npm Supply Chain RAT Behaviour: Observing the Axios Attack Pattern

The Axios RAT executed, phoned home, and erased its traces within seconds of npm install. Build runtime detection across process tree monitoring, network telemetry, and file system events — and a Sigma rule for the Axios IOC pattern.

Advanced 12 min read

OT Incident Response and Forensics: CISA's ICS Evidence Guidance

CISA's OT Zero Trust guidance covers pre-crisis decision matrices and MITRE ATT&CK for ICS playbooks. Learn what to preserve from PLCs and HMIs before power cycling, how to structure OT IR playbooks, and how to build forensic readiness into air-gapped OT networks.

Intermediate 12 min read

OT Network Monitoring with CISA Malcolm: Visibility for ICS/SCADA

CISA's OT Zero Trust guidance recommends Malcolm for OT network traffic analysis. Deploy Zeek-based passive monitoring with Modbus and DNP3 parsers, build behavioral baselines, and implement specification-based detection for process variable anomalies.

Intermediate 15 min read

OpenTelemetry Language SDK Security

Harden OpenTelemetry language SDKs against CVE-2026-40182 unbounded memory DoS in the OTLP exporter and CVE-2026-40891 gRPC trailer parsing DoS, and track silent fixes in fast-moving SDK releases.

Advanced 16 min read

Wazuh Cluster Security Hardening

Harden Wazuh against CVE-2026-30893 cluster path traversal RCE (CVSS 9.0) and CVE-2026-25769 deserialization RCE, with monitoring for Wazuh's coordinated disclosure patterns.

Advanced 15 min read

Grafana Beyla eBPF Auto-Instrumentation Security

Harden Grafana Beyla deployments by scoping eBPF privileges, restricting process visibility, preventing telemetry data leakage, and controlling network-level instrumentation scope.

Advanced 15 min read

Grafana SQL Expressions and Plugin RCE Hardening

Harden Grafana deployments against CVE-2026-27876-class RCE via SQL expressions and Enterprise plugins by controlling feature toggles, plugin permissions, and monitoring silent Grafana security releases.

Intermediate 15 min read

Graylog Security Hardening

Harden Graylog log management against CVE-2026-1435 session fixation (CVSS 9.1), CVE-2026-1436 IDOR, and the 7-CVE April-May 2026 batch, with Graylog's advisory monitoring patterns.

Intermediate 14 min read

OpenTelemetry Tail-Based Sampling for Security-Critical Traces

Configure OpenTelemetry Collector tail-based sampling to guarantee retention of security-relevant spans while controlling volume, and track OTel Collector CVEs from public PRs.

Intermediate 15 min read

Prometheus Remote Write and Config Endpoint Security

Harden Prometheus against CVE-2026-42151 OAuth credential exposure via /-/config, CVE-2026-42154 stored XSS, and the recurring pattern of security fixes shipped in routine Prometheus releases.

Advanced 15 min read

Vector Log Pipeline Security

Harden Vector log collection pipelines against Lua transform code execution, source input injection, credential exposure, and silent security fixes in Vector's Datadog-driven release process.

Intermediate 12 min read

Prometheus Alertmanager Security: Receiver Credentials, Silencing Controls, and Inhibition Rules

Alertmanager routes security alerts to PagerDuty, Slack, and email. Exposed receiver credentials, unauthenticated silence APIs, and overly broad inhibition rules can suppress legitimate security alerts — exactly what an attacker wants. Hardening Alertmanager protects the alerting pipeline itself.

intermediate 14 min read

Continuous Profiling Security with Parca and Pyroscope

Protect sensitive call-stack and memory data collected by eBPF-based continuous profilers (Parca, Pyroscope) with access control, PII scrubbing, and retention limits.

intermediate 12 min read

Distributed Tracing Security: Jaeger, Tempo, and Sensitive Span Data Scrubbing

Distributed traces capture the full execution path of a request across services — including HTTP headers, query parameters, and error payloads that may contain PII, authentication tokens, or internal system details. Securing the tracing pipeline requires data scrubbing at collection, access controls on trace storage, and sampling policies that limit exposure.

intermediate 13 min read

Elasticsearch Security Hardening: TLS, Role-Based Access, and Audit Logging

Elasticsearch clusters exposed without authentication have been the source of hundreds of data breaches. Enabling TLS between nodes and clients, configuring role-based access control, and enabling audit logging closes the most common attack vectors against ELK and EFK stacks.

intermediate 12 min read

Grafana Security Hardening: Authentication, RBAC, and Data Source Permissions

Grafana dashboards expose infrastructure metrics, logs, and traces — often including sensitive operational data. Hardening authentication, restricting data source access by team, disabling anonymous access, and auditing snapshot sharing prevents data exposure.

intermediate 12 min read

Loki Security Hardening: Authentication, Tenant Isolation, and Log Tampering Prevention

Loki aggregates logs from all services. Without authentication, anyone who reaches the Loki endpoint reads all logs. Multi-tenancy requires strict tenant isolation, rate limiting per tenant, and append-only storage to prevent log tampering.

intermediate 13 min read

Application Security Logging: Structured Events, PII Redaction, and SIEM Integration

Application logs are the primary source of authentication, authorisation, and API activity signals. Most applications log too little for security, or too much PII. Structured security events fix both.

intermediate 13 min read

Cloud Provider Audit Logs: CloudTrail, GCP Audit Logs, and Azure Monitor Hardening

Cloud audit logs are your primary evidence source for privilege escalation, data exfiltration, and lateral movement at the cloud control plane. They require active hardening to be tamper-proof and queryable.

intermediate 13 min read

Network Flow Analysis: NetFlow, IPFIX, and eBPF for Traffic Anomaly Detection

Flow records capture who talked to whom, when, and how much — without packet payload. They detect C2 beaconing, lateral movement, data exfiltration, and port scanning that signature-based tools miss.

intermediate 13 min read

Security Chaos Engineering: Testing Detection and Response Capabilities

If you haven't tested that your detection rules fire and alerts route correctly, you don't know if they work. Security chaos engineering injects controlled attacks to validate the detection stack before a real attacker does.

intermediate 13 min read

Alert Deduplication and Correlation Patterns: Beating Alert Fatigue at Scale

Per-rule grouping and fingerprint-based dedup get you from 10,000 alerts/day to 200. Correlation across signals is the next jump — to 30 actionable incidents.

intermediate 14 min read

Forensic Readiness: Log Retention, Capture, and Chain of Custody for Incident Response

What you don't capture, you can't investigate. Forensic readiness is the discipline of designing the logging layer so that, after an incident, you have what you need.

intermediate 13 min read

Honeypot and Deception Technology in Kubernetes: Canary Tokens, Fake Credentials, and Honeypod Pods

Deception detects attackers who evade signature-based controls by placing fake credentials, canary tokens, and honeypot services that trigger high-confidence alerts on access.

intermediate 14 min read

Security SLOs and Error Budgets: SRE Discipline Applied to Detection and Response

Treat security as a service: define SLIs (detection coverage, MTTD), set SLOs, track burn rate. The same discipline that makes reliability measurable makes security measurable.

intermediate 13 min read

Threat Hunting with Osquery: Fleet Queries, Detection Packs, and IOC Sweeps

Osquery turns your fleet into a queryable database. Scheduled queries surface persistence mechanisms, lateral movement artefacts, and IOCs across thousands of hosts simultaneously.

intermediate 14 min read

Detection Engineering Metrics: MTTD, MTTR, Signal-to-Noise, and Coverage Tracking

If you cannot measure your detection program, you cannot improve it. The metrics that matter, how to compute them, and what they trigger when they shift.

intermediate 14 min read

OpenTelemetry PII Leakage: Stopping Sensitive Data in Span Attributes, Baggage, and Logs

OTel traces capture authorization headers, URL params, internal IDs, and database query strings by default. Without redaction, your traces are an exfiltration target.

intermediate 14 min read

SIEM Cost Optimization: Cardinality, Retention, Sampling, and Index-Tier Strategy

SIEM bills double yearly because nobody owns the spend. Cardinality control, retention tiering, and sampling reduce cost 40-70% without losing detection.

intermediate 15 min read

Detection-as-Code with Sigma: Versioned, Tested, Vendor-Neutral SIEM Rules

Detection logic scattered across SIEM consoles and shell scripts does not scale. Sigma rules in Git, tested in CI, converted to any backend on deploy, do.

intermediate 14 min read

Security Dashboards That Engineers Actually Use: Grafana Designs for Hardening Verification

Most security dashboards are vanity metrics: total alerts this month, pie charts of vulnerability severity, traffic heatmaps that look impressive but...

advanced 16 min read

OpenTelemetry for Security: Distributed Tracing of Authentication and Authorization Flows

Distributed tracing is standard for performance debugging, but almost no team uses it for security.

intermediate 18 min read

OpenTelemetry Collector Pipelines: Securing Receivers, Processors, and Exporters

An OTel Collector pipeline with default settings forwards every attribute, header, and trace to your backend with no filtering or authentication.

advanced 18 min read

Lateral Movement Detection: Network Patterns, Authentication Anomalies, and Alert Correlation

East-west traffic inside a Kubernetes cluster is a blind spot for most security teams.

intermediate 18 min read

Security-Relevant Prometheus Metrics: What to Collect, How to Alert, When to Page

Prometheus is deployed in most Kubernetes environments for infrastructure monitoring (CPU, memory, disk, request latency)...

advanced 18 min read

eBPF-Based Security Monitoring: Tetragon for Process, Network, and File Observability

Falco monitors syscalls for runtime detection. Tetragon (CNCF/Cilium) goes deeper: it monitors process execution, network connections, and file...

advanced 16 min read

Log Integrity and Tamper Detection: Ensuring Your Audit Trail Is Trustworthy

An attacker's first post-compromise action is covering their tracks. On a Linux host, this means deleting /var/log/audit/audit.log, clearing journal...

advanced 18 min read

Container Escape Detection: Runtime Signals, Kernel Indicators, and Response Automation

Container escapes are the highest-impact attack in Kubernetes. A single compromised pod that escapes its container gains access to the underlying...

advanced 16 min read

Kubernetes Audit Log Pipeline Design: From API Server to SIEM

Kubernetes audit logging at the RequestResponse level captures everything: every API call, every request body, every response payload.

intermediate 15 min read

Crypto Mining Detection: CPU Patterns, Network Signatures, and Automated Response

Cryptojacking is the most common post-compromise activity in Kubernetes environments.

advanced 18 min read

Building Detection Rules That Don't Cry Wolf: Alert Design for Security Events

Security detection that generates 50+ false positives per day is worse than no detection: it trains the team to ignore alerts.

intermediate 15 min read

Certificate Expiry Monitoring: Automated Detection Across TLS, mTLS, and Signing Certificates

Certificate expiry is the most common cause of preventable production outages. When a TLS certificate expires, HTTPS connections fail, mTLS...

intermediate 17 min read

Incident Response Runbooks: Structured Procedures for Common Security Events

Detection without documented response is security theatre. Most teams have alerts that fire at 3 AM, but no written procedure for what the on-call...

intermediate 20 min read

Centralized Logging Architecture for Security: Fluentd, Vector, and Loki Compared

Self-managed log infrastructure is one of the highest operational costs for small-to-medium teams.

advanced 22 min read

Building a Security Audit Log Pipeline That Scales: auditd to Elasticsearch

Linux audit logs are the ground truth for security investigation. auditd captures kernel-level events that no userspace tool can see: file access by...