Envoy WASM Filters for API Security: Injection-Safe Middleware in the Data Plane

The Problem

Envoy processes every request through a filter chain: listener filters handle connection-level concerns, network filters handle protocol framing, and HTTP filters handle request and response semantics. The built-in HTTP filters — envoy.filters.http.jwt_authn, envoy.filters.http.ratelimit, envoy.filters.http.router — are compiled directly into the Envoy binary in C++. When one of those filters has a bug, it runs in the same memory space as the rest of Envoy. A heap-use-after-free in a C++ filter does not stay contained to that filter. It corrupts the Envoy process.

The WASM filter extension (envoy.filters.http.wasm) changes this boundary. A WASM module loaded as an HTTP filter runs inside an embedded WebAssembly VM — either V8, Wasmtime, or WAMR depending on Envoy build configuration. The WASM linear memory model means the module has its own isolated heap. A logic error, unbounded allocation, or memory corruption inside the WASM module cannot reach into Envoy’s own heap. Envoy crashes the WASM VM, not itself.

This isolation makes WASM the right deployment model for security logic you do not entirely trust: third-party rate limiting implementations, tenant-supplied validation rules, policy logic that you want to update without recompiling Envoy, or security filters written by teams who are not C++ engineers. The filter can inspect and modify request and response headers, read and modify the request or response body, make external HTTP calls via the dispatch_http_call host function, maintain state shared across filter instances in the same Envoy process via the shared data API, and respond directly to the client without forwarding to the upstream.

The security-relevant use cases are concrete:

Input validation: validate request bodies against a size limit, character set restrictions, or structural rules at the proxy layer, before the request reaches your application. No network round-trip to an external validation service; the filter runs inline.
Rate limiting: a fixed-window limiter that uses Envoy’s shared data API to maintain per-client counters across filter instances in the same Envoy process. Per-route, per-tenant, per-operation: scoped as narrowly as needed.
JWT claim inspection: decode the JWT in the Authorization header, extract the tenant ID or scope set, enforce claim-based routing or rejection without forwarding to an external auth service.
Schema enforcement: enforce additionalProperties: false or field-level type constraints on the JSON body at the proxy layer, preventing novel fields from reaching upstream services where they might trigger unexpected code paths.

None of this is free. The WASM ABI between the filter and Envoy is itself an attack surface. The CVE record for Envoy includes bugs on both sides of this boundary, and understanding them is prerequisite to trusting WASM filters for security enforcement.

CVE Patterns in WASM Envoy Filters

The WASM sandbox protects Envoy from filter bugs. The host functions the filter calls — the ABI between WASM and Envoy — run in Envoy’s own C++ code. Every dispatch_http_call, get_property, set_header, and continueRequest is implemented in Envoy, not in the filter. Vulnerabilities in these host functions are vulnerabilities in Envoy itself, exploitable by any filter that makes the wrong call sequence.

CVE-2023-27487 (Envoy 1.25.x, CVSS 8.6): A WASM filter that called continueRequest() after the downstream connection had been reset triggered a heap-use-after-free in Envoy’s WASM host code. The filter’s call to a legitimate host function caused Envoy — not the filter — to access freed memory. The bug was not in any filter logic. It was in the lifecycle management of the C++ context that the host functions operate on. Envoy crashed.

CVE-2022-21655 (Envoy, CVSS 7.5): A WASM filter that set a local reply code via send_local_reply could manipulate response status codes in ways Envoy did not correctly account for in its internal state machine. A filter could set a response code that conflicted with Envoy’s own routing decisions, producing responses with inconsistent status codes — a mechanism for a malicious filter to suppress error codes or misrepresent the outcome of upstream calls.

CVE-2021-29492 (Envoy, CVSS 8.8): A WASM filter could call clearRouteCache() to invalidate Envoy’s cached route decision, then set headers that caused Envoy to re-select a different route. If the re-selected route had weaker or absent authentication requirements, the filter bypassed authentication enforcement. A WASM filter acting as an HTTP client for one route could clear the route cache and cause Envoy to forward the request to a protected internal service without requiring authentication.

These three CVEs share a pattern: the exploited mechanism is a host function that the filter is legitimately permitted to call, used in an ordering or combination that Envoy’s host-function implementation did not defend against. The lesson is not that WASM filters are unsafe. It is that the ABI between filter and host is a bidirectional trust boundary that requires the same scrutiny as any API boundary in a security-critical system.

Threat Model

Vulnerable WASM filter panics the VM. A filter with a logic error — division by zero, out-of-bounds access within its own linear memory, stack overflow from unbounded recursion — triggers a VM fault. Envoy resets the filter’s VM. Depending on the fail_open vs. fail_closed configuration, in-flight requests through that filter either pass unenforced or are dropped. This is a denial-of-enforcement condition rather than a code execution condition.

WASM filter with CVE-2021-29492-style auth bypass. A filter that calls clearRouteCache() after selectively modifying headers can redirect requests to protected upstream services without valid credentials. The filter does not need to steal tokens or forge signatures. It manipulates Envoy’s own routing logic into making the wrong forwarding decision. Requests reach protected upstreams without authentication.

Malicious WASM filter loaded via supply chain compromise. An attacker with write access to the OCI registry or the WASM binary storage location substitutes a modified filter. The modified filter calls dispatch_http_call on every inbound request, exfiltrating the Authorization header, the request body, and any custom headers to an external endpoint over HTTPS. The outbound call is made by Envoy’s own HTTP client (not by code running in the WASM sandbox), so it is not governed by the outbound rules the proxy applies to the workload’s own traffic. Only controls enforced outside the proxy, such as a NetworkPolicy on the pod, can stop it.

WASM filter with unbounded memory allocation causes Envoy OOM. A filter that allocates memory proportional to the request body size, without a cap, on every request, fills the Envoy worker’s heap. Envoy is OOM-killed. Data plane is unavailable. Kubernetes restarts the pod; the filter loads again; the allocations resume. Without a per-VM memory limit, this cycle continues.

Hardening Configuration

1. Rate Limiting WASM Filter in Rust

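The proxy-wasm Rust crate provides the ABI bindings. The filter implements the HttpContext trait and overrides on_http_request_headers to read the client identifier and enforce a request limit using Envoy’s shared data API. The counting scheme is a fixed window per client rather than a true token bucket: each client gets one counter per window, and the counter is abandoned when the window rolls over.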

# rate-limiter/Cargo.toml
[package]
name = "rate-limiter"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
proxy-wasm = "0.2"

[profile.release]
lto = true
opt-level = "z"
strip = true
panic = "abort"

// rate-limiter/src/lib.rs
use proxy_wasm::traits::*;
use proxy_wasm::types::*;

proxy_wasm::main! {{
    proxy_wasm::set_http_context(|_, _| -> Box<dyn HttpContext> {
        Box::new(RateLimiter {
            limit: 100,
            window_secs: 60,
        })
    });
}}

struct RateLimiter {
    limit: u64,
    window_secs: u64,
}

impl Context for RateLimiter {}

impl HttpContext for RateLimiter {
    fn on_http_request_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
        // Prefer x-forwarded-for; fall back to a static key for local testing
        let client_ip = self
            .get_http_request_header("x-forwarded-for")
            .and_then(|v| {
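                // x-forwarded-for may be "ip1, ip2, ip3"; take the first entry.
                // That entry is client-supplied unless an edge proxy overwrites
                // the header, so treat this key as spoofable by default.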
                v.split(',').next().map(|s| s.trim().to_string())
            })
            .unwrap_or_else(|| "unknown".to_string());

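        // Bucket by floor(unix_time / window_secs): a fixed window.
        // get_current_time() returns the host's clock as a SystemTime.
        let now_secs = self
            .get_current_time()
            .duration_since(std::time::UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        let key = format!("rl:{}:{}", client_ip, now_secs / self.window_secs);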

        // Shared data is a key-value store on the host side, visible to every
        // instance of this filter in the same Envoy process (keyed by vm_id).
        // The second return value is a CAS token for optimistic locking.
        let (current_bytes, cas) = self.get_shared_data(&key);

        let current_count: u64 = current_bytes
            .as_deref()
            .and_then(|b| b.try_into().ok())
            .map(u64::from_be_bytes)
            .unwrap_or(0);

        if current_count >= self.limit {
            self.send_http_response(
                429,
                vec![
                    ("content-type", "application/json"),
                    ("retry-after", &self.window_secs.to_string()),
                    ("x-ratelimit-limit", &self.limit.to_string()),
                    ("x-ratelimit-remaining", "0"),
                ],
                Some(br#"{"error":"rate_limit_exceeded","message":"Too many requests"}"#),
            );
            return Action::Pause;
        }

        let new_count = (current_count + 1).to_be_bytes();
        // Pass the CAS token to detect concurrent increments on the same key.
        // If another filter instance has written between our read and write,
        // set_shared_data returns Err(Status::CasMismatch). We let the request
        // through rather than retrying, so this request goes uncounted: slight
        // under-counting at high concurrency in exchange for no added latency.
        let _ = self.set_shared_data(&key, Some(&new_count), cas);

        Action::Continue
    }
}
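The filter above keys on the left-most x-forwarded-for entry, which is whatever the client chose to send unless an edge proxy overwrites the header. A minimal sketch of a more defensive selection counts entries from the right instead. The `client_key` helper and its `trusted_hops` parameter are hypothetical names introduced here for illustration; `trusted_hops` is assumed to be the number of proxies you control that append to the header before the request reaches this filter.

```rust
/// Pick a rate-limit key from an x-forwarded-for value.
///
/// Each trusted proxy appends the address it received the request from, so
/// with `trusted_hops` trusted proxies the client address is the
/// `trusted_hops`-th entry counted from the right. Entries further left were
/// supplied by the client and are ignored.
fn client_key(xff: Option<&str>, trusted_hops: usize) -> String {
    let entries: Vec<&str> = xff
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect();

    // With no trusted proxies, or fewer entries than expected, no entry can
    // be trusted. Fall back to a shared key so such requests share one bucket.
    if trusted_hops == 0 || entries.len() < trusted_hops {
        return "unknown".to_string();
    }
    entries[entries.len() - trusted_hops].to_string()
}
```

With one trusted edge proxy, a spoofed header such as "6.6.6.6, 203.0.113.7" resolves to 203.0.113.7, the address the edge proxy actually observed. Inside the filter, the result would replace the `client_ip` value computed from the first entry.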

Build the filter:

# Install the wasm32 target once
rustup target add wasm32-unknown-unknown

# Build
cargo build \
  --manifest-path rate-limiter/Cargo.toml \
  --target wasm32-unknown-unknown \
  --release

# Output: rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm

# Verify it is a valid WASM module (wasm-tools available via cargo install wasm-tools)
wasm-tools validate rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm

# Generate the SHA256 for Envoy's integrity check
sha256sum rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm

The panic = "abort" profile option matters for WASM. The wasm32-unknown-unknown target has no stack unwinding, so a Rust panic cannot be caught and resumed inside the module. Setting panic = "abort" makes that explicit and keeps unwinding machinery out of the binary: any panic immediately traps the WASM module, which Envoy handles as a VM fault rather than as uncontrolled state corruption.

2. Input Validation WASM Filter in Rust

This filter runs on the request body path. It enforces a maximum size, UTF-8 validity, and null byte rejection. Each check is a separate early-return to produce a specific error response rather than a generic 400.

// input-validator/src/lib.rs
use proxy_wasm::traits::*;
use proxy_wasm::types::*;

const MAX_BODY_BYTES: usize = 102_400; // 100 KiB

proxy_wasm::main! {{
    proxy_wasm::set_http_context(|_, _| -> Box<dyn HttpContext> {
        Box::new(InputValidator)
    });
}}

struct InputValidator;

impl Context for InputValidator {}

impl HttpContext for InputValidator {
    fn on_http_request_body(&mut self, body_size: usize, end_of_stream: bool) -> Action {
        // Envoy calls this callback repeatedly as body chunks arrive.
        // Do not run validation logic on partial bodies — wait for end_of_stream.
        // Without this check, a filter that rejects on partial data will reject
        // legitimate requests mid-stream, and a filter that counts on the full body
        // for decisions will produce incorrect results.
        if !end_of_stream {
            // Pause asks Envoy to buffer this chunk and call back with more data.
            return Action::Pause;
        }

        // Body size check runs before reading the bytes — avoids materialising
        // a 10 MiB body into WASM linear memory just to reject it.
        if body_size > MAX_BODY_BYTES {
            self.send_http_response(
                413,
                vec![("content-type", "application/json")],
                Some(br#"{"error":"payload_too_large","max_bytes":102400}"#),
            );
            return Action::Pause;
        }

        let body = match self.get_http_request_body(0, body_size) {
            Some(b) => b,
            None => return Action::Continue, // No body — nothing to validate
        };

        // UTF-8 check: reject binary content masquerading as JSON.
        // Many injection payloads use invalid UTF-8 sequences to bypass
        // string-level WAF rules while reaching parsers that accept arbitrary bytes.
        if std::str::from_utf8(&body).is_err() {
            self.send_http_response(
                400,
                vec![("content-type", "application/json")],
                Some(br#"{"error":"invalid_encoding","message":"Request body must be valid UTF-8"}"#),
            );
            return Action::Pause;
        }

        // Null byte check: null bytes in JSON strings are valid UTF-8 but cause
        // unexpected behaviour in C-string APIs upstream. A null byte in a
        // filename field, for example, can truncate the path at the OS level.
        if body.contains(&0u8) {
            self.send_http_response(
                400,
                vec![("content-type", "application/json")],
                Some(br#"{"error":"invalid_content","message":"Null bytes are not permitted"}"#),
            );
            return Action::Pause;
        }

        // Content-type enforcement: if content-type is application/json,
        // verify the body starts with a JSON structural character.
        // This does not validate the full JSON structure — that would require
        // a JSON parser compiled into the WASM module — but it rejects
        // obvious content-type mismatches that indicate request smuggling or
        // misconfigured clients sending form data to JSON endpoints.
        let content_type = self
            .get_http_request_header("content-type")
            .unwrap_or_default();

        if content_type.contains("application/json") {
            let first_non_ws = body.iter().copied().find(|b| !b.is_ascii_whitespace());
            match first_non_ws {
                // b'-' covers negative numbers. Matching on bytes by value avoids
                // a reference pattern around the digit range.
                Some(b'{' | b'[' | b'"' | b't' | b'f' | b'n' | b'-' | b'0'..=b'9') => {},
                _ => {
                    self.send_http_response(
                        400,
                        vec![("content-type", "application/json")],
                        Some(br#"{"error":"invalid_content_type","message":"Body does not appear to be valid JSON"}"#),
                    );
                    return Action::Pause;
                }
            }
        }

        Action::Continue
    }
}
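The checks in this filter do not depend on the host, so they can be pulled into a pure function and unit-tested natively before the module is ever loaded into Envoy. The sketch below mirrors the same four checks in the same order; `Rejection`, `validate_body`, and `status_code` are hypothetical names introduced here, not part of the proxy-wasm crate.

```rust
/// Why a body was rejected. One variant per early return in the filter.
#[derive(Debug, PartialEq)]
enum Rejection {
    TooLarge,
    InvalidUtf8,
    NullByte,
    NotJson,
}

/// Run the size, encoding, null-byte, and JSON-prefix checks on a complete body.
fn validate_body(content_type: &str, body: &[u8], max_bytes: usize) -> Result<(), Rejection> {
    if body.len() > max_bytes {
        return Err(Rejection::TooLarge);
    }
    if std::str::from_utf8(body).is_err() {
        return Err(Rejection::InvalidUtf8);
    }
    if body.contains(&0u8) {
        return Err(Rejection::NullByte);
    }
    if content_type.contains("application/json") {
        // Only the first non-whitespace byte is inspected; this is a cheap
        // plausibility check, not a JSON parser.
        let first = body.iter().copied().find(|b| !b.is_ascii_whitespace());
        match first {
            Some(b'{' | b'[' | b'"' | b't' | b'f' | b'n' | b'-' | b'0'..=b'9') => {}
            _ => return Err(Rejection::NotJson),
        }
    }
    Ok(())
}

/// HTTP status the filter would send for each rejection.
fn status_code(rejection: &Rejection) -> u32 {
    match rejection {
        Rejection::TooLarge => 413,
        Rejection::InvalidUtf8 | Rejection::NullByte | Rejection::NotJson => 400,
    }
}
```

The body callback then shrinks to reading the bytes, calling `validate_body`, and mapping any `Rejection` to a local reply, which keeps the host-dependent code small enough to review by eye.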

3. Load the WASM Filters via Istio EnvoyFilter

The EnvoyFilter resource patches the sidecar’s HTTP filter chain. The WASM filter is inserted before the router filter so it runs on every inbound request. The sha256 field is mandatory for production — Envoy verifies the hash before loading the module and refuses to execute a module whose hash does not match.

# rate-limiter-wasm.yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: rate-limiter-wasm
  namespace: production
spec:
  workloadSelector:
    labels:
      app: payment-api  # Scope to a single workload, not mesh-wide
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.wasm
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
          value:
            config:
              name: rate_limiter
              fail_open: false      # fail_closed: reject requests if the VM crashes
              vm_config:
                vm_id: rate-limiter-vm
                runtime: envoy.wasm.runtime.v8
                allow_precompiled: false  # Do not load cached precompiled code
                code:
                  remote:
                    http_uri:
                      uri: https://artifacts.internal/wasm/rate_limiter.wasm
                      cluster: internal_artifacts
                      timeout: 10s
                    sha256: "7a3f9c2b1d8e4f6a0c5b2e9d1f3a7c4b8e2f6d0a9c3b7e1f4d8a2c6f0e9b3d5a"  # placeholder: use the 64 hex characters printed by sha256sum
              configuration:
                "@type": type.googleapis.com/google.protobuf.StringValue
                value: |
                  {"limit_per_minute": 100, "window_secs": 60}

# input-validator-wasm.yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: input-validator-wasm
  namespace: production
spec:
  workloadSelector:
    labels:
      app: payment-api
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.wasm
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
          value:
            config:
              name: input_validator
              fail_open: false
              vm_config:
                vm_id: input-validator-vm
                runtime: envoy.wasm.runtime.v8
                allow_precompiled: false
                code:
                  remote:
                    http_uri:
                      uri: https://artifacts.internal/wasm/input_validator.wasm
                      cluster: internal_artifacts
                      timeout: 10s
                    sha256: "4b8e2f6d0a9c3b7e1f4d8a2c6f0e9b3d5a7f9c2b1d8e4f6a0c5b2e9d1f3a7c4b"
              configuration:
                "@type": type.googleapis.com/google.protobuf.StringValue
                value: |
                  {"max_body_bytes": 102400}

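The filter insertion order matters. Envoy processes HTTP filters in the order they appear in the chain, and the rate limiter should come first (as the outermost check) so rate-limited requests are rejected before the validator allocates memory to read the body. Both patches here use INSERT_BEFORE relative to the router, so their relative position follows the order in which Istio applies the two EnvoyFilter resources: the patch applied first ends up further from the router. Set spec.priority on each EnvoyFilter to make that order explicit rather than relying on the default ordering.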

4. Verify WASM Module Integrity Before Deploying

Envoy’s sha256 field in the remote code block is the first integrity gate. Generate it from the exact binary that will be served:

# Generate SHA256 from the release binary
WASM_SHA=$(sha256sum rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm | awk '{print $1}')
echo "sha256: \"${WASM_SHA}\""

# Sign the binary with cosign for provenance tracking
cosign sign-blob \
  --key cosign.key \
  --output-signature rate_limiter.wasm.sig \
  rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm

# Verify before uploading to artifact storage
cosign verify-blob \
  --key cosign.pub \
  --signature rate_limiter.wasm.sig \
  rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm

# Upload to internal artifact storage (example: GCS)
gsutil cp rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm \
  gs://internal-wasm-artifacts/rate_limiter_${WASM_SHA:0:12}.wasm

# The URI in the EnvoyFilter should reference the content-addressed path,
# not a mutable "latest" path:
# uri: https://artifacts.internal/wasm/rate_limiter_7a3f9c2b1d8e.wasm

The cosign signature is checked in the pipeline; the sha256 in the EnvoyFilter is checked by Envoy at load time. Even if an attacker substitutes the file at the storage location, Envoy refuses to load the module if the hash does not match. The pin only helps if it comes from the build you reviewed: a hash regenerated from whatever a mutable URI currently serves is equivalent to loading an unverified third-party script.
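A pinned digest is only useful if it is well formed, and a single stray character is easy to miss in review. As a hedged sketch, a pre-deploy lint could check the pinned value and confirm that the URI is content-addressed. Both helpers below (`is_sha256_hex`, `uri_pins_digest`) are hypothetical names for illustration; they check shape only and do not compute or verify any hash.

```rust
/// True if `s` has the shape of a hex-encoded SHA-256 digest:
/// exactly 64 characters, all hexadecimal.
fn is_sha256_hex(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// True if `uri` embeds the first `prefix_len` characters of `digest`,
/// meaning the path is content-addressed rather than a mutable name.
fn uri_pins_digest(uri: &str, digest: &str, prefix_len: usize) -> bool {
    // The shape check runs first, so the slice below is on ASCII text and
    // cannot split a multi-byte character.
    is_sha256_hex(digest)
        && prefix_len > 0
        && prefix_len <= digest.len()
        && uri.contains(&digest[..prefix_len])
}
```

Run against the manifests in CI, a check like this fails the build when a digest has the wrong length or when the URI points at a mutable path such as rate_limiter.wasm instead of the content-addressed name.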

5. Restrict Envoy Egress with NetworkPolicy

The dispatch_http_call host function allows a WASM filter to make outbound HTTP requests from inside Envoy. A malicious filter uses this to exfiltrate request headers. Envoy’s plugin configuration includes a capability_restriction_config that can withhold individual host functions from a plugin, which suits filters that never need outbound calls. For filters that do call out, or where that restriction is not in place, the mitigation is at the network layer: restrict which external endpoints Envoy itself can reach.

# Restrict Envoy sidecar egress to declared cluster upstreams only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-api-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes:
  - Egress
  egress:
  # Allow egress to the upstream service (within cluster)
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: production
      podSelector:
        matchLabels:
          app: payment-backend
    ports:
    - protocol: TCP
      port: 8080
  # Allow egress to the WASM artifact storage (for filter loading at startup)
  - to:
    - ipBlock:
        cidr: 10.100.0.0/24  # Internal artifact storage CIDR
    ports:
    - protocol: TCP
      port: 443
  # Allow DNS
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
  # Allow Istio control plane communication
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: istio-system
    ports:
    - protocol: TCP
      port: 15012
    - protocol: TCP
      port: 15010

This policy does not prevent a WASM filter from calling dispatch_http_call — the ABI call succeeds from the filter’s perspective. What it ensures is that the resulting HTTP request from Envoy is dropped by the network layer before it leaves the node. A filter that attempts to exfiltrate data to attacker.example.com will see the call time out rather than succeed. Combined with alerting on unexpected Envoy egress connections (via network flow logs or eBPF-based monitoring), this provides both prevention and detection.

6. Set WASM VM Memory Limits in Envoy Bootstrap

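A memory cap prevents a runaway filter from OOM-killing Envoy. Envoy’s vm_config has no per-VM memory limit field, so the cap is declared in the module and the Envoy configuration only has to load that module. In standalone Envoy (non-Istio), the bootstrap config looks like this: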

# envoy-bootstrap.yaml (relevant section)
static_resources:
  listeners:
  - name: inbound
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8080
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          http_filters:
          - name: envoy.filters.http.wasm
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              config:
                name: rate_limiter
                fail_open: false
                vm_config:
                  vm_id: rate-limiter-vm
                  runtime: envoy.wasm.runtime.v8
                  allow_precompiled: false
                  # vm_config has no memory limit field. Without a cap, a filter
                  # that allocates proportional to request body size grows its
                  # linear memory until the Envoy process is OOM-killed.
                  # Cap memory in the module itself by declaring a maximum in
                  # its memory section (for example 256 pages = 16 MiB; each
                  # page is 64 KiB). The runtime then refuses memory.grow past
                  # that limit and the allocation fails inside the sandbox.
                  code:
                    local:
                      filename: /etc/envoy/wasm/rate_limiter.wasm

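A maximum declared in the module’s memory section is enforced by the WASM runtime itself: memory.grow fails once the limit is reached. One way to declare it for a Rust module is at link time:

# Declare a 16 MiB maximum (256 pages) at link time. wasm-ld's --max-memory
# takes bytes and must be a multiple of the 64 KiB page size.
RUSTFLAGS="-C link-arg=--max-memory=16777216" cargo build \
  --manifest-path rate-limiter/Cargo.toml \
  --target wasm32-unknown-unknown \
  --release

# Optional size and speed optimisation of the linked module
wasm-opt \
  --memory-packing \
  -O3 \
  rate-limiter/target/wasm32-unknown-unknown/release/rate_limiter.wasm \
  -o rate_limiter_hardened.wasm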

# Verify the memory section
wasm-tools dump rate_limiter_hardened.wasm | grep -A2 "memory"
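The page counts used in this section are easy to get wrong by a factor of 1024. A small sketch of the arithmetic follows; the helper names are hypothetical, and the worst-case figure assumes one VM per worker thread, each allowed to grow to its declared maximum.

```rust
/// Size of one WebAssembly linear-memory page: 64 KiB.
const WASM_PAGE_BYTES: u64 = 64 * 1024;

/// Bytes covered by a number of pages.
fn pages_to_bytes(pages: u64) -> u64 {
    pages * WASM_PAGE_BYTES
}

/// Pages needed to hold `bytes`, rounded up to a whole page.
fn pages_needed(bytes: u64) -> u64 {
    (bytes + WASM_PAGE_BYTES - 1) / WASM_PAGE_BYTES
}

/// Upper bound on linear memory if every VM grows to its maximum.
fn worst_case_bytes(max_pages: u64, vm_count: u64) -> u64 {
    pages_to_bytes(max_pages) * vm_count
}
```

For example, 16 pages is 1 MiB and 256 pages is 16 MiB, the 100 KiB body cap used by the input validator fits in 2 pages, and four VMs capped at 256 pages each can hold at most 64 MiB between them.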

7. Monitor WASM Filter Behaviour via Envoy Admin API

The Envoy admin API exposes WASM VM metrics. Scrape these into your observability stack and alert on anomalies.

# Query Envoy admin for WASM stats (admin bound to localhost:15000 in Istio)
kubectl exec -n production deployment/payment-api -c istio-proxy -- \
  curl -s localhost:15000/stats | grep wasm

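# Key metrics to monitor (stat names differ between Envoy versions and builds;
# confirm them against your own /stats output):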
# wasm.envoy_wasm.rate_limiter.active               — number of active WASM contexts
# wasm.envoy_wasm.rate_limiter.created              — total contexts created
# wasm.runtime.v8.rate-limiter-vm.active            — active VM instances
# wasm.runtime.v8.rate-limiter-vm.compile_time_us   — compile time on load
# wasm.runtime.v8.rate-limiter-vm.execution_time_us — execution time per request

# Check filter-level error counts (VM resets, panics):
kubectl exec -n production deployment/payment-api -c istio-proxy -- \
  curl -s localhost:15000/stats | grep -E 'wasm.*(_error|_reset|_panic)'

Alert thresholds:

# Prometheus alerting rules for WASM filter health
groups:
- name: wasm-filter-health
  rules:
  - alert: WasmFilterVMReset
    expr: increase(wasm_runtime_v8_rate_limiter_vm_vm_resets_total[5m]) > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "WASM filter VM has been reset — filter is crashing"
      description: "A VM reset means requests are failing closed (if fail_open: false) or bypassing the filter (if fail_open: true). Investigate the filter log for the panic message."

  - alert: WasmFilterHighLatency
    expr: histogram_quantile(0.99,
            rate(wasm_runtime_v8_rate_limiter_vm_execution_time_us_bucket[5m])
          ) > 10000
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "WASM filter p99 execution time exceeds 10ms"
      description: "WASM filters run synchronously in the request path. Latency above 10ms at p99 affects every request. Check for unbounded loops or large shared-data reads in the filter logic."

  - alert: WasmFilterRateLimitSpiking
    expr: sum(increase(envoy_http_downstream_rq_xx{envoy_response_code_class="4"}[5m])) > 500
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "High rate of 4xx responses — possible rate limiter misconfiguration or attack"

Expected Behaviour

Normal operation — rate limit not reached: The rate limiter filter’s on_http_request_headers reads the shared data counter for the client IP, finds it below 100, increments it, and returns Action::Continue. The input validator’s on_http_request_body waits for end_of_stream, reads the body, passes all checks, and returns Action::Continue. Total filter latency contribution is under 1ms for both filters combined on a payload under 10 KiB.
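The read, compare, and increment steps described above can be sketched as pure functions, which makes the window arithmetic and the byte encoding testable without a running proxy. The function names are hypothetical; the key format and big-endian u64 encoding follow the rate limiter listing.

```rust
/// Shared-data key for a client in the window containing `unix_secs`.
/// A zero window is treated as one second to avoid dividing by zero.
fn window_key(client: &str, unix_secs: u64, window_secs: u64) -> String {
    format!("rl:{}:{}", client, unix_secs / window_secs.max(1))
}

/// Decode a stored counter. Missing or malformed values count as zero.
fn decode_count(raw: Option<&[u8]>) -> u64 {
    match raw {
        Some(bytes) if bytes.len() == 8 => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(bytes);
            u64::from_be_bytes(buf)
        }
        _ => 0,
    }
}

/// Decide one request: `Some(encoded new count)` to admit and store,
/// `None` when the limit has already been reached.
fn admit(raw: Option<&[u8]>, limit: u64) -> Option<[u8; 8]> {
    let count = decode_count(raw);
    if count >= limit {
        None
    } else {
        Some((count + 1).to_be_bytes())
    }
}
```

With a 60-second window, timestamps 120 and 179 map to the same key and 180 starts a new one, so a client that exhausts its limit at second 179 is admitted again one second later. That burst at the window boundary is the usual cost of fixed-window counting.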

Rate limit exceeded: The client receives:

HTTP/1.1 429 Too Many Requests
content-type: application/json
retry-after: 60
x-ratelimit-limit: 100
x-ratelimit-remaining: 0

{"error":"rate_limit_exceeded","message":"Too many requests"}

The request does not reach the upstream. The Envoy stats counter envoy_http_downstream_rq_4xx increments. No upstream connection is opened.

Input validation failure — body too large: The client receives:

HTTP/1.1 413 Content Too Large
content-type: application/json

{"error":"payload_too_large","max_bytes":102400}

The body bytes are never materialised in WASM linear memory — the check runs against body_size before calling get_http_request_body.

WASM filter VM crash (panic or trap): With fail_open: false, Envoy fails the stream closed and returns a 5xx local reply to the client (503 in current releases). Envoy logs the WASM trap, including its type. Subsequent requests through the filter keep failing closed until the VM is re-created. Whether that happens automatically depends on the Envoy version and the plugin’s failure policy; on releases without a reload policy, the VM stays failed until the plugin configuration is reloaded or the proxy restarts. This is the intended behaviour: a crashing security filter should not transparently pass traffic.

With fail_open: true, a crashing VM passes requests through without filter enforcement. This setting is appropriate only for observability filters, never for security enforcement filters.

Trade-offs

WASM filter vs. ext_authz. The ext_authz filter calls an external gRPC or HTTP service for each request. It is language-agnostic, supports complex policy logic, and can be updated without touching the filter chain config. The trade-off is latency: an ext_authz call adds at minimum one network round-trip per request (typically 1–5ms on a healthy in-cluster service). WASM filters run in-process — no network call, no serialisation overhead. For simple logic like token-bucket rate limiting or body size checks, WASM latency is under 200µs. For complex policy that requires querying a database or evaluating hundreds of rules, ext_authz with OPA is better. For JWT validation where the signing keys can be cached in WASM shared data, WASM avoids the network call entirely.

Shared data for rate limiting is per-Envoy-process. Each worker thread runs its own WASM VM, but the shared data store lives on the host side, keyed by vm_id, and is visible to every VM with that vm_id in the same process. Counters are therefore consistent across worker threads but not across proxies. A workload with 4 sidecar replicas enforces the limit once per replica, so a client whose requests are load-balanced across all of them can reach up to limit * num_replicas before being rejected everywhere. For truly global rate limiting, use Redis via ext_proc or use the built-in Envoy rate limit service with a Redis backend. WASM shared data rate limiting is appropriate for coarse-grained protection and cases where slight imprecision in the count is acceptable.
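The rate limiter listing drops its increment when the CAS write fails. Where that imprecision matters, a bounded retry is one option. The sketch below models it against an in-memory stand-in for the host store so it can run natively. `MockStore`, `Outcome`, and `increment_with_retry` are hypothetical, and the `between` hook exists only so a test can simulate a concurrent writer. The mock assumes, as the listing does, that a write without a CAS token is unconditional.

```rust
use std::collections::HashMap;

/// In-memory stand-in for shared data: key -> (value, CAS version).
#[derive(Default)]
struct MockStore {
    entries: HashMap<String, (Vec<u8>, u32)>,
}

impl MockStore {
    fn get(&self, key: &str) -> (Option<Vec<u8>>, Option<u32>) {
        match self.entries.get(key) {
            Some((value, version)) => (Some(value.clone()), Some(*version)),
            None => (None, None),
        }
    }

    /// Fails when a CAS token is supplied and no longer matches.
    fn set(&mut self, key: &str, value: &[u8], cas: Option<u32>) -> Result<(), ()> {
        let current = self.entries.get(key).map(|(_, version)| *version);
        if let (Some(expected), Some(actual)) = (cas, current) {
            if expected != actual {
                return Err(());
            }
        }
        self.entries
            .insert(key.to_string(), (value.to_vec(), current.unwrap_or(0) + 1));
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Admitted,
    Limited,
    Contended,
}

fn decode_count(raw: Option<Vec<u8>>) -> u64 {
    match raw {
        Some(ref bytes) if bytes.len() == 8 => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(bytes);
            u64::from_be_bytes(buf)
        }
        _ => 0,
    }
}

/// Read, check the limit, and write with CAS, retrying up to `max_attempts`.
fn increment_with_retry<F: FnMut(&mut MockStore)>(
    store: &mut MockStore,
    key: &str,
    limit: u64,
    max_attempts: u32,
    mut between: F,
) -> Outcome {
    for _ in 0..max_attempts {
        let (raw, cas) = store.get(key);
        let count = decode_count(raw);
        if count >= limit {
            return Outcome::Limited;
        }
        between(&mut *store); // test hook: simulate another writer
        if store.set(key, &(count + 1).to_be_bytes(), cas).is_ok() {
            return Outcome::Admitted;
        }
    }
    Outcome::Contended
}
```

Keeping `max_attempts` small bounds the added latency; what to do on `Outcome::Contended` (admit uncounted, or reject) is a policy choice the filter has to make explicitly.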

Rust is the right language for proxy-wasm. TinyGo compiles to WASM and supports a subset of the proxy-wasm ABI, but the garbage collector runs inside the WASM linear memory and produces non-deterministic pause times in the request path. C++ with the Envoy SDK is the original implementation language but requires managing memory manually in a context where a buffer overflow in the filter cannot corrupt Envoy’s heap — it can only corrupt the filter’s own linear memory. Rust provides memory safety without a GC, deterministic latency, and the proxy-wasm crate has the most complete ABI coverage of any language binding as of 2026.

allow_precompiled: false trades startup time for security. When allow_precompiled is true, a .wasm file may carry precompiled native code that supporting runtimes load directly instead of compiling the bytecode. Envoy does not verify that the precompiled code matches the WASM bytecode, so the option is only safe for modules from fully trusted sources. With allow_precompiled: false (the default), Envoy compiles from the WASM bytecode on load. For a 500 KiB WASM module, compilation takes 50–200ms. For a sidecar that restarts infrequently, this is acceptable. For a sidecar with a 100ms startup time budget, measure the compilation time with your specific module and runtime before deciding.

Failure Modes

WASM filter loaded without sha256 verification. Envoy’s own RemoteDataSource rejects a remote code block with an empty sha256, but higher-level wrappers do not always carry that requirement through: in Istio’s WasmPlugin, for example, the sha256 field is optional. A module fetched without a pinned hash is whatever bytes the URI returns. An attacker with write access to the artifact storage location, or positioned to perform a MITM attack on the fetch, can substitute a malicious module. The SHA field is the binding between the filter logic you reviewed and the binary that executes in production. It is not optional for security filters.

dispatch_http_call exfiltration without network egress controls. A WASM filter that calls dispatch_http_call makes the HTTP request using Envoy’s own HTTP client. The request appears in network flow logs as originating from the Envoy process, not from the workload container. Without NetworkPolicy restricting Envoy’s egress, the exfiltration call succeeds silently. The filter has access to every request header (Authorization, Cookie, X-API-Key) and the full request body. The combination of supply chain compromise on the WASM artifact and unrestricted Envoy egress provides a complete exfiltration path.

Missing end_of_stream check in the body callback. The on_http_request_body callback is invoked once per body chunk, not once per request. For a 500 KiB body delivered in 10 chunks, Envoy calls the callback 10 times. Without the if !end_of_stream { return Action::Pause; } guard, a filter that rejects on partial body state will reject legitimate requests after the first chunk. A filter that runs validation logic on partial bodies will produce incorrect results: a JSON structure check on the first 16 KiB of a 100 KiB body will always fail because the JSON is incomplete. Returning Action::Pause tells Envoy to buffer the body and call the callback again when more data arrives, with the final call setting end_of_stream to true.
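The per-chunk call pattern can be simulated natively. The sketch below is a variant of the validator's control flow, not a copy: it also applies the size cap on every callback, so an oversized upload is rejected as soon as the buffered total crosses the limit instead of after the whole body has been buffered. It assumes the size passed to the callback is the cumulative number of bytes buffered so far. `Action` is a local stand-in for the proxy-wasm type, and `on_body` and `drive` are hypothetical names.

```rust
/// Local stand-in for proxy_wasm::types::Action.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Action {
    Continue,
    Pause,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Forwarded,
    Rejected(u32),
}

/// Decision for one body callback. Returns the action plus an optional
/// local-reply status code.
fn on_body(buffered: usize, end_of_stream: bool, max: usize) -> (Action, Option<u32>) {
    if buffered > max {
        return (Action::Pause, Some(413)); // reject as soon as the cap is crossed
    }
    if !end_of_stream {
        return (Action::Pause, None); // keep buffering, no verdict yet
    }
    (Action::Continue, None) // full body seen and within the cap
}

/// Feed chunk sizes through `on_body` the way the host would, returning the
/// outcome and the number of callbacks made.
fn drive(chunks: &[usize], max: usize) -> (Outcome, usize) {
    let mut buffered = 0;
    for (i, len) in chunks.iter().enumerate() {
        buffered += len;
        let last = i + 1 == chunks.len();
        match on_body(buffered, last, max) {
            (_, Some(code)) => return (Outcome::Rejected(code), i + 1),
            (Action::Continue, None) => return (Outcome::Forwarded, i + 1),
            (Action::Pause, None) => {}
        }
    }
    (Outcome::Forwarded, chunks.len())
}
```

Six 16 KiB chunks stay under a 100 KiB cap and are forwarded on the sixth callback, while ten 50 KiB chunks are rejected on the third, before the remaining seven are ever buffered.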

No memory cap results in Envoy OOM. A filter that calls get_http_request_body(0, body_size) for a 50 MiB upload allocates 50 MiB in WASM linear memory. Without the size check before the read, and without a memory limit on the WASM VM, this allocation succeeds and the WASM linear memory grows to 50 MiB. Multiply by concurrent requests and by the number of WASM context instances Envoy maintains, and the worker’s heap fills. The size check before get_http_request_body is not optional — it is the guard that prevents the read from happening at all for oversized bodies.

fail_open: true on a security filter. Operators sometimes set fail_open: true during initial rollout to avoid 500 errors if the filter has bugs. This is correct practice for a feature flag or telemetry filter, where failing open means the request proceeds normally. For a rate limiter or input validator, failing open means that a crashing or misconfigured filter transparently bypasses the security control it is supposed to enforce. Use fail_open: false for all security filters from the first deployment. Test with a known-bad WASM module to verify that Envoy correctly fails closed before promoting to production.