Defending Prometheus Against High-Cardinality Label Injection and DoS

Problem

Prometheus keeps every active time series in memory. Each unique combination of metric name and label values is a separate time series. http_requests_total{method="GET", path="/api/users", status="200"} and http_requests_total{method="GET", path="/api/users/12345", status="200"} are two different time series, and if the path label carries arbitrary user IDs, there are as many time series as there are unique user IDs.
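
As a rough illustration of that counting rule, the following sketch treats a series as a unique (metric name, label set) pair and compares a raw path label with a templated one. The helper name count_series and the sample data are made up for this example; this is not how Prometheus stores series internally.

```python
def count_series(samples):
    """Count distinct series: one per unique (metric name, label set) pair."""
    return len({(name, tuple(sorted(labels.items()))) for name, labels in samples})


# 1,000 requests, each with a different user ID embedded in the path label
samples = [
    ("http_requests_total",
     {"method": "GET", "path": f"/api/users/{user_id}", "status": "200"})
    for user_id in range(1000)
]
series_with_ids = count_series(samples)

# The same requests with the ID replaced by a fixed route template
templated = [(name, {**labels, "path": "/api/users/:id"}) for name, labels in samples]
series_templated = count_series(templated)

print(series_with_ids, series_templated)  # 1000 1
```

The number of series follows the number of distinct label values, not the number of requests, which is why a single unbounded label is enough to cause the problem.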

This is the high-cardinality problem, and it usually shows up as an accidental instrumentation or configuration mistake. It is also a deliberate attack vector: any attacker who can influence the values that appear in metric labels, or who has write access to a Prometheus remote-write or Pushgateway endpoint, can create an unbounded number of time series, exhausting Prometheus memory until the process is OOM-killed and the entire monitoring stack goes dark.

The attack surface is broader than commonly understood:

Pushgateway exposure. The Prometheus Pushgateway has no authentication by default. Any process that can reach it can push arbitrary metrics with arbitrary label values. In a Kubernetes cluster where pods have network access to the monitoring namespace (common without NetworkPolicy), any compromised pod can write metrics with 10,000 unique label combinations, each allocating memory in Prometheus.

Remote-write relay injection. Prometheus remote-write ingestion endpoints (Thanos Receiver, Grafana Mimir, Cortex) accept whatever series they are sent unless per-tenant series limits are configured and tuned to real usage. An attacker who can send HTTP POST requests to these endpoints can inject arbitrary time series. Many remote-write endpoints are protected only by network ACLs, not authentication.

Instrumentation code injection. In applications that create metric labels from user input (HTTP request paths, usernames, tenant identifiers), a high-cardinality attack can be triggered by sending requests with many unique values — without any direct access to the metrics infrastructure. The application faithfully creates a new time series for each unique label value.

Alertmanager and recording rule cascades. High cardinality doesn’t just affect Prometheus memory — it propagates through the stack. Recording rules that process high-cardinality metrics produce many output series. Alertmanager receives many alerts. The evaluation loop slows down. The entire observability stack degrades simultaneously.

The business impact is significant: the monitoring system goes offline exactly when it is needed most — during or after an incident. If the attack coincides with an active breach, the attacker has blinded the defenders’ primary visibility tool.

Target systems: Prometheus 2.x on any deployment; Thanos, Cortex, Grafana Mimir remote-write endpoints; Prometheus Pushgateway; any application using Prometheus client libraries with high-cardinality label patterns.


Threat Model

Adversary 1 — Pushgateway bombardment from compromised pod. Access level: code execution in a pod with network access to the monitoring namespace. Objective: POST metrics with 100,000 unique label values to the Pushgateway, causing Prometheus to allocate ~1 GB of memory per 100,000 time series, OOM the Prometheus pod, and eliminate monitoring visibility.

Adversary 2 — Application-layer cardinality injection. Access level: ability to send HTTP requests to a monitored application. Objective: send requests with unique path parameters (UUIDs, tokens) that become metric labels, creating unbounded time series without touching the metrics infrastructure directly.

Adversary 3 — Remote-write endpoint injection. Access level: network access to a Thanos Receiver or Mimir ingestion endpoint (common in multi-tenant environments). Objective: inject millions of time series via remote-write, overwhelming the TSDB and causing ingestion backpressure that stops legitimate metrics from being recorded.

Adversary 4 — Metrics scrape endpoint poisoning. Access level: ability to respond to a Prometheus scrape (e.g., via ARP spoofing or a compromised exporter). Objective: return a metrics payload with 50,000 unique label values on each scrape interval, exhausting Prometheus memory over time.
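
The arithmetic behind these scenarios can be sketched as follows. The per-series cost of 10,000 bytes is an assumption derived from the "~1 GB per 100,000 time series" figure quoted for Adversary 1; real per-series memory depends on label sizes, churn, and Prometheus version. The 8 GB budget and 15-second interval are likewise illustrative, and the estimate assumes every poisoned scrape returns fresh label values so that earlier series are still held in memory.

```python
# Assumption: ~1 GB per 100,000 series, i.e. about 10,000 bytes per series
BYTES_PER_SERIES = 10_000


def estimated_head_bytes(series, bytes_per_series=BYTES_PER_SERIES):
    """Rough memory needed to hold `series` active time series."""
    return series * bytes_per_series


def scrapes_until_exhausted(memory_budget_bytes, new_series_per_scrape,
                            bytes_per_series=BYTES_PER_SERIES):
    """Scrapes needed before injected series alone fill the memory budget."""
    per_scrape = new_series_per_scrape * bytes_per_series
    return -(-memory_budget_bytes // per_scrape)  # ceiling division


# Adversary 1: 100,000 injected series
print(estimated_head_bytes(100_000))  # 1000000000

# Adversary 4: 50,000 new series per scrape against an assumed 8 GB budget
scrapes = scrapes_until_exhausted(8_000_000_000, 50_000)
print(scrapes, scrapes * 15)  # 16 scrapes, 240 seconds at a 15s interval
```

Under these assumptions a poisoned scrape target needs only minutes, not hours, which is why the limits in the next section are set per scrape.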


Configuration / Implementation

Step 1 — Set per-target and global time series limits

# prometheus.yml: enforce scrape limits
global:
  scrape_interval: 15s
  scrape_timeout: 10s

  # Default sample limit for every scrape job that does not set its own.
  # Setting limits in the global block requires Prometheus 2.45+;
  # on older versions, set them per job as shown below.
  # A sample limit helps but doesn't prevent cardinality explosion over time
  # (series churn across scrapes), so combine it with the alerts in Step 3.
  sample_limit: 5000

scrape_configs:
- job_name: 'application'
  static_configs:
  - targets: ['app:8080']

  # Per-scrape sample limit, applied after metric relabelling.
  # A scrape that returns more samples than this is rejected entirely
  # and the target is marked as failed.
  sample_limit: 10000

  # Maximum length of label names and label values;
  # a scrape that exceeds either limit also fails.
  label_name_length_limit: 256
  label_value_length_limit: 1024
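
Because sample_limit rejects the whole scrape rather than truncating it, the behaviour is easy to misjudge. The following sketch models that all-or-nothing rule in plain Python; apply_sample_limit is a hypothetical helper written for illustration, not part of Prometheus or any client library, and it treats a limit of 0 as "no limit".

```python
def apply_sample_limit(samples, sample_limit):
    """Model of the all-or-nothing rule: return (ingested_samples, scrape_ok)."""
    if sample_limit > 0 and len(samples) > sample_limit:
        # Over the limit: nothing from this scrape is ingested
        return [], False
    return list(samples), True


within = apply_sample_limit(range(10_000), 10_000)
over = apply_sample_limit(range(10_001), 10_000)

print(len(within[0]), within[1])  # 10000 True
print(len(over[0]), over[1])      # 0 False
```

A target that drifts one sample over its limit therefore loses all of its metrics, which is the reason to leave headroom above the expected count and to alert on limit hits.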

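Prometheus has no configuration option that caps the total number of active series, so the per-scrape limits above are the main control and the server still needs memory headroom. The storage block below is not a series limit; it only bounds how old an out-of-order sample may be and still be accepted:

# prometheus.yml
storage:
  tsdb:
    # Accept samples up to 10 minutes older than the newest sample of a series.
    # This does not limit cardinality. Size Prometheus memory for roughly
    # 2× the expected series count for headroom.
    out_of_order_time_window: 10m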

Start Prometheus with flags that bound retention, query cost, and concurrent connections (no command-line flag limits the series count):

prometheus \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=15d \
  --query.max-samples=50000000 \
  --web.max-connections=512
# These flags cap query memory and connection count; none of them caps cardinality.

Step 2 — Authenticate and rate-limit the Pushgateway

# Deploy Pushgateway behind nginx with authentication
cat > /etc/nginx/conf.d/pushgateway.conf << 'EOF'
# limit_req_zone is only valid in the http context. Files in conf.d are
# included there, so the zone is declared outside the server block.
limit_req_zone $binary_remote_addr zone=pushgw:10m rate=10r/m;

upstream pushgateway {
    server localhost:9091;
}

server {
    listen 9092;

    # Basic authentication
    auth_basic "Pushgateway";
    auth_basic_user_file /etc/nginx/.htpasswd;

    # Rate limit metric pushes; reject excess requests with 429
    # instead of the default 503
    limit_req zone=pushgw burst=5;
    limit_req_status 429;

    # Limit request body size (prevent huge metric payloads)
    client_max_body_size 1m;

    location / {
        proxy_pass http://pushgateway;
    }
}
EOF

# Generate credentials
htpasswd -c /etc/nginx/.htpasswd monitoring-writer

systemctl reload nginx
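
Once the proxy is in place, every client has to send credentials with each push. A minimal sketch of such a client using only the standard library is shown below. The host name, job name, metric, and password are placeholders, and build_push_request is a hypothetical helper; it only constructs the request, so nothing is sent until urlopen is called.

```python
import base64
import urllib.parse
import urllib.request


def build_push_request(base_url, job, body, username, password):
    """Build an authenticated POST to the proxied Pushgateway (not sent here)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/metrics/job/{urllib.parse.quote(job, safe='')}",
        data=body.encode(),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "text/plain; version=0.0.4",
        },
    )


# Placeholder values for illustration only
req = build_push_request(
    "http://pushgateway:9092",
    "nightly-backup",
    "backup_last_success_timestamp_seconds 1700000000\n",
    "monitoring-writer",
    "s3cret",
)

# To send it:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

Basic authentication sends the password in a reversible encoding, so the proxy should terminate TLS in any deployment where the network path is not trusted.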

Apply NetworkPolicy to restrict who can reach the Pushgateway:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pushgateway-access-control
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app: pushgateway
  policyTypes: [Ingress]
  ingress:
  # Only allow from approved namespaces
  - from:
    - namespaceSelector:
        matchLabels:
          monitoring-write-access: "true"
    ports:
    - port: 9091
      protocol: TCP
  # Allow Prometheus to scrape
  - from:
    - podSelector:
        matchLabels:
          app: prometheus
    ports:
    - port: 9091

Step 3 — Monitor cardinality and alert on explosion

# Prometheus alerting rules for cardinality monitoring

groups:
- name: prometheus_cardinality
  rules:
  # Alert when total time series count grows rapidly.
  # prometheus_tsdb_head_series is a gauge, so use delta() rather than rate().
  - alert: PrometheusHighCardinalityGrowth
    expr: |
      delta(prometheus_tsdb_head_series[5m]) > 10000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Prometheus series count growing rapidly: {{ $value | humanize }} new series/5m"
      description: "Check for high-cardinality metrics being scraped or pushed"

  # Alert when a single job exposes too many series.
  # scrape_samples_post_metric_relabeling is recorded per target on every scrape.
  - alert: JobHighCardinality
    expr: |
      sum by (job) (scrape_samples_post_metric_relabeling) > 50000
    labels:
      severity: warning
    annotations:
      summary: "Job {{ $labels.job }} has {{ $value | humanize }} active series"

  # Alert when Prometheus memory usage is high (cardinality indicator)
  - alert: PrometheusHighMemoryUsage
    expr: |
      process_resident_memory_bytes{job="prometheus"} /
      (1024 * 1024 * 1024) > 8
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Prometheus using {{ $value | humanize }}GB of memory — possible cardinality attack"

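  # Alert when any scrape is rejected for exceeding sample_limit.
  # This counter is exposed by Prometheus itself and does not identify the
  # offending target; check the Targets page for the failing scrape.
  - alert: ScrapeSampleLimitHit
    expr: |
      increase(prometheus_target_scrapes_exceeded_sample_limit_total[5m]) > 0
    labels:
      severity: warning
    annotations:
      summary: "A scrape target exceeded sample_limit; review cardinality"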

Step 4 — Identify high-cardinality metrics in existing data

# Find the top-cardinality metrics in your Prometheus instance
# Query the TSDB via the Prometheus API

curl -s "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=topk(20, count by (__name__)({__name__!=""}))' | \
  jq -r '.data.result[] | "\(.value[1]) \(.metric.__name__)"' | \
  sort -rn | head -20

# For each metric and job, count how many distinct values the pod label takes
curl -s "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=sort_desc(count by (__name__, job)(count by (job, __name__, pod)({__name__!=""})))' | \
  jq '.data.result[:10]'

# Use mimirtool for deeper cardinality analysis
# Install: go install github.com/grafana/mimir/cmd/mimirtool@latest
mimirtool analyze prometheus \
  --address=http://prometheus:9090 \
  --output cardinality-report.json

# The report layout differs between mimirtool versions; check the field names
# in cardinality-report.json and adjust this filter to match.
jq '.metrics | sort_by(-.series_count) | .[:20] |
  .[] | {metric: .metric_name, series: .series_count, labels: .label_names}' \
  cardinality-report.json
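
Prometheus also reports head cardinality statistics directly through its TSDB status API (/api/v1/status/tsdb), which avoids running an expensive query over every series. The sketch below ranks metrics by series count from that response. The function names and the prometheus:9090 address are assumptions for this example, and the parsing relies on the seriesCountByMetricName field of the response.

```python
import json
import urllib.request


def top_metrics(tsdb_status, n=10):
    """Return [(metric_name, series_count)] from a TSDB status response."""
    entries = tsdb_status["data"]["seriesCountByMetricName"]
    ranked = sorted(entries, key=lambda e: int(e["value"]), reverse=True)
    return [(e["name"], int(e["value"])) for e in ranked[:n]]


def fetch_tsdb_status(base_url="http://prometheus:9090"):
    """Fetch head statistics; the address is a placeholder."""
    with urllib.request.urlopen(f"{base_url}/api/v1/status/tsdb", timeout=10) as resp:
        return json.load(resp)


# Offline example using a hand-written response of the same shape
example = {
    "status": "success",
    "data": {
        "seriesCountByMetricName": [
            {"name": "up", "value": 12},
            {"name": "http_requests_total", "value": 48213},
            {"name": "process_cpu_seconds_total", "value": 12},
        ]
    },
}
print(top_metrics(example, n=2))
# [('http_requests_total', 48213), ('up', 12)]
```

The API returns only a limited number of top entries per category, so it is suited to spotting the worst offenders rather than producing a complete inventory.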

Step 5 — Add cardinality controls to application instrumentation

# Python: implement cardinality-safe metric labels

import re

from prometheus_client import Counter

# BAD: Using full path as label (unbounded cardinality)
# requests_total = Counter('http_requests_total', 'Total requests', ['path'])
# requests_total.labels(path=request.path).inc()  # Unique per URL!

# GOOD: Normalise path to a known pattern set.
# Entries are the route templates that normalise_path() produces.
KNOWN_PATHS = {
    "/api/users",
    "/api/users/:id",
    "/api/products",
    "/api/products/:id",
    "/health",
}

# A whole path segment that is either a UUID or a run of digits
_ID_SEGMENT = re.compile(
    r'/(?:[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}|\d+)(?=/|$)'
)

def normalise_path(path: str) -> str:
    """Replace dynamic path segments with a placeholder."""
    # Drop any query string, then replace UUIDs and numeric IDs
    normalised = _ID_SEGMENT.sub('/:id', path.split('?', 1)[0])

    # Anything that is not a known route template collapses into a catch-all
    return normalised if normalised in KNOWN_PATHS else "/other"

# GOOD: Cardinality-safe instrumentation
requests_total = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'path', 'status']
)

def track_request(method: str, path: str, status: int):
    requests_total.labels(
        method=method,
        path=normalise_path(path),  # Bounded cardinality
        status=str(status)
    ).inc()
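
Path normalisation works when the set of routes is known in advance. For labels such as tenant identifiers, where legitimate values cannot be listed up front, one possible complement is a guard that admits only the first N distinct values and maps everything after that to an overflow value. The BoundedLabel class below is a hypothetical helper sketched for this purpose, not part of prometheus_client.

```python
import threading


class BoundedLabel:
    """Admit at most max_values distinct label values; map the rest to overflow."""

    def __init__(self, max_values, overflow="other"):
        self._max = max_values
        self._overflow = overflow
        self._seen = set()
        self._lock = threading.Lock()

    def __call__(self, value):
        with self._lock:
            if value in self._seen:
                return value
            if len(self._seen) < self._max:
                self._seen.add(value)
                return value
            # Cap reached: unseen values share a single series
            return self._overflow


tenant_label = BoundedLabel(max_values=2)
print([tenant_label(t) for t in ["acme", "globex", "initech", "acme"]])
# ['acme', 'globex', 'other', 'acme']
```

The guard is first come, first served, so an attacker who arrives early can occupy the slots. It caps memory use but does not replace an allow-list where one is available.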

Expected Behaviour

| Signal | Before hardening | After hardening |
|---|---|---|
| sample_limit per scrape target | Not set (unlimited) | 10,000 samples per scrape; a scrape that exceeds the limit is rejected in full |
| Pushgateway accepts unauthenticated POSTs | Yes | No; the nginx proxy requires credentials |
| Time series growth rate alert | Not configured | Alert fires when >10,000 new series/5min |
| Compromised pod pushes 100,000 label values | Prometheus OOMs | NetworkPolicy blocks pod from reaching Pushgateway |
| Application creates per-user metrics | Unbounded cardinality | Path normalisation caps label values |

Verification:

# Check current series count
curl -s "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=prometheus_tsdb_head_series' | \
  jq '.data.result[0].value[1]'

# Check for failing scrapes in the job (a scrape rejected by sample_limit
# shows health "down" with the reason in lastError)
curl -s "http://prometheus:9090/api/v1/targets" | \
  jq '.data.activeTargets[] | select(.scrapePool == "application") | {health, lastError}'

# Verify the proxy rejects unauthenticated requests (expect 401)
curl -s -o /dev/null -w '%{http_code}\n' -X POST http://pushgateway:9092/metrics/job/test

Trade-offs

| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| Sample limit per scrape | Hard cap on per-target series growth | Scrapes that hit the limit are entirely rejected; missing data | Set limit to 2× expected series count; alert on limit hits so limits can be adjusted |
| Pushgateway authentication | Prevents anonymous cardinality injection | All Pushgateway clients must authenticate | Use service account tokens or an internal CA; automate credential distribution |
| Path normalisation in application | Bounded metric cardinality | May lose some debugging precision (all IDs become :id) | Retain a separate high-cardinality trace-level metric for debugging; use exemplars |
| NetworkPolicy restricting Pushgateway access | Only approved namespaces can push | New services need namespace label to push metrics | Document the onboarding process; add the label as part of namespace creation automation |

Failure Modes

| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Sample limit set too low | Legitimate metrics dropped; dashboards show gaps | Alert: ScrapeSampleLimitHit; missing series in dashboards | Increase sample limit for the affected job; investigate why the target has more series than expected |
| Cardinality attack before limits are set | Prometheus OOMs; all monitoring goes dark | None from within the stack; monitoring is down | Shortening retention does not shrink in-memory series. Raise the memory limit temporarily, or, if Prometheus OOMs again while replaying the WAL, move the WAL directory aside (recent unflushed samples are lost); block or fix the offending source; add limits; restart normally |
| Path normalisation changes a metric that was being alerted on | Alert stops firing because label value changed | Alert test fails; on-call investigates | Update alert queries to use the normalised path format; document metric label conventions |
| Rate limit on Pushgateway breaks legitimate batch job | Batch job metrics not received; the nginx proxy rejects pushes (503 by default, 429 if limit_req_status 429 is set) | Job completion metrics missing; batch job succeeds but no telemetry | Increase rate limit for authenticated clients from batch job service accounts |