WASM Host Function Security: Hardening the WASM-to-Host Boundary

WASM Host Function Security: Hardening the WASM-to-Host Boundary

The Problem

The WebAssembly sandbox model provides one strong guarantee: a module cannot directly access host memory, invoke system calls, or manipulate host data structures. Every resource the module touches — files, sockets, clocks, random number generators — must be explicitly granted through imported host functions. The sandbox does not protect the host. The host API you expose to the module defines exactly what the module can do, whether that module is trusted first-party code or an adversary who has taken control of module execution.

Host functions are the only attack surface across this boundary. A WASM module cannot escalate privileges, access restricted resources, or escape the sandbox through any path except the host functions you have imported to it. This makes host function design a security-critical activity. A host API that is over-scoped, under-validated, or vulnerable to race conditions gives an attacker a direct channel from within the WASM sandbox to host-level capabilities.

Three categories of failure are common in production host function implementations:

Over-broad host API exposure. A plugin host exposes a general filesystem write function to all loaded modules because the trusted first-party module needed it. Untrusted third-party modules loaded into the same runtime inherit the import. The sandbox is intact — the module cannot directly write to the filesystem — but it does not need to. It calls the exposed host function.

Missing or incorrect input validation. WASM passes values into host functions as integers: i32 and i64 values that may represent raw pointers into linear memory, lengths, offsets, enum discriminants, or flags. The host function receives these values without any type information beyond the integer width. A host implementation that dereferences a guest-supplied pointer without first validating that the pointer and length fall within the module’s current linear memory bounds is vulnerable to host memory corruption or information disclosure.

TOCTOU across the WASM/host boundary. A WASM module running in a multi-threaded runtime — or using shared memory — can modify the bytes at a memory address after the host function has validated them but before it has used them. The host reads the pointer, validates the contents, and then reads the contents again to act on them. Between those two reads, the module has changed the data. The host acts on different content than it validated.

Target systems: any Wasmtime, WasmEdge, Wasmer, or Wazero embedder that exposes custom host functions to untrusted or semi-trusted WASM modules; plugin platforms; WASM-based policy engines with host-side I/O; WASM runtimes running user-supplied modules.

Threat Model

  • Over-scoped host API granting unintended capabilities. A platform exposes a log_message(ptr: i32, len: i32) host function for debugging. The implementation formats the string and writes it to a structured log sink. An attacker who controls the WASM module passes a pointer and length covering sensitive host-side data that was placed in shared memory or, more subtly, crafts a format string that the log sink interpolates. The attacker reads data they were not intended to access.

  • Pointer validation bypass via out-of-bounds dereference. A host function read_config(key_ptr: i32, key_len: i32, out_ptr: i32, out_len: i32) reads a configuration value and writes it to the guest buffer at out_ptr. The host reads the key from key_ptr in linear memory and writes the result to out_ptr in linear memory. The attacker passes out_ptr = -1 as an unsigned 32-bit value — 0xFFFFFFFF in the 32-bit address space. If the host does not validate that out_ptr + out_len fits within the module’s current memory bounds, it writes beyond the end of linear memory. In runtimes where linear memory is a heap-allocated buffer, this writes into adjacent host heap memory.

  • TOCTOU string injection. A host function open_file(path_ptr: i32, path_len: i32) accepts a filesystem path from the guest, validates that the path does not escape the allowed directory, and then opens the file. The WASM module runs in a shared-memory configuration. A second thread in the guest overwrites the path bytes in linear memory after the host’s path validation but before the open() syscall. The host opens a file it did not validate.

  • Privilege escalation via unconstrained path parameters. A host function write_file(path_ptr: i32, path_len: i32, data_ptr: i32, data_len: i32) was intended to allow the module to write to a designated scratch directory. The implementation reads the path from linear memory and calls fs::write. No check enforces that the path remains within the scratch directory. A WASM module passes ../../../etc/cron.d/backdoor as the path. The host writes to a path it was not designed to allow.

  • Enum value injection. A host function set_log_level(level: i32) maps the integer to an internal enum with four variants. The implementation casts the integer directly to the enum type without range validation. The attacker passes level = 9999. In Rust, transmuting an out-of-range value to an enum with #[repr(i32)] causes undefined behaviour. In C, it is an invalid enum value that may be used in a switch statement with no default case.

  • Access level: All adversaries require only the ability to execute WASM instructions in the runtime — no memory corruption or sandbox escape required. The attacks proceed through the legitimate host function import mechanism.

  • Objective: Read host memory or files outside the intended scope; write to host filesystem paths not within the allowed directory; inject values that cause undefined behaviour or logic errors in host code; bypass access controls enforced by the host API.

Hardening Configuration

Step 1: Audit Every Host Function Against Module Requirements

Before writing validation code, establish what the module actually needs. Every host function import that a module does not need is attack surface that should be removed. This is the principle of minimal host API exposure applied concretely.

Use wasmparser to list all imports declared by a WASM module before linking it:

use wasmparser::{Parser, Payload};
use std::fs;

fn audit_imports(wasm_path: &str) -> anyhow::Result<()> {
    let wasm = fs::read(wasm_path)?;
    let parser = Parser::new(0);

    for payload in parser.parse_all(&wasm) {
        if let Payload::ImportSection(reader) = payload? {
            for import in reader {
                let import = import?;
                println!("import: {}/{} ({:?})",
                    import.module, import.name, import.ty);
            }
        }
    }
    Ok(())
}

Compare the import list against your known-required set. Any import that is declared by the module but not on the approved list for that module’s trust level should cause the load to fail before the module executes a single instruction. Define a per-module allowlist in your configuration:

# module-policy.toml
[modules.analytics-plugin]
allowed_imports = [
    "env/log_message",
    "env/get_timestamp",
    "env/read_config_key",
]
# NOT allowed: env/write_file, env/execute_command, env/http_fetch

[modules.trusted-core]
allowed_imports = [
    "env/log_message",
    "env/get_timestamp",
    "env/read_config_key",
    "env/write_scratch_file",
    "env/http_fetch",
]

Enforce this at link time by only linking functions the module is approved to use:

use wasmtime::{Engine, Linker, Module, Store};

fn create_restricted_linker(
    engine: &Engine,
    allowed: &[&str],
) -> anyhow::Result<Linker<()>> {
    let mut linker = Linker::new(engine);

    // Only register functions the module is approved to use.
    // An import not registered here will cause instantiation to fail.
    for name in allowed {
        match *name {
            "env/log_message"    => register_log_message(&mut linker)?,
            "env/get_timestamp"  => register_get_timestamp(&mut linker)?,
            "env/read_config_key" => register_read_config_key(&mut linker)?,
            _ => {
                anyhow::bail!("unknown allowed import: {}", name);
            }
        }
    }
    Ok(linker)
}

Wasmtime’s instantiation will return an error if the module declares an import that is not registered in the linker. Do not call linker.allow_unknown_imports(true) — that silently satisfies missing imports with trap functions and makes it impossible to detect over-declared imports.

Step 2: Validate All Pointer Arguments Before Dereferencing

Every host function that accepts a pointer and length pair from a WASM module must validate the region before reading from or writing to it. The validation must be performed against the module’s current memory bounds, not a cached size, because a WASM module can call memory.grow between calls and the current size must be re-checked on each invocation.

Define a helper that performs all pointer validation in one place:

use wasmtime::{Caller, Memory};

/// Read a byte slice from WASM linear memory.
/// Validates that [ptr, ptr+len) is fully within current memory bounds.
/// Returns an error rather than panicking on invalid input.
fn read_guest_bytes<'a>(
    caller: &'a Caller<'_, ()>,
    memory: &Memory,
    ptr: i32,
    len: i32,
) -> anyhow::Result<&'a [u8]> {
    // Reject negative values — WASM i32 is signed; negative values as
    // unsigned offsets would alias large addresses.
    if ptr < 0 || len < 0 {
        anyhow::bail!("invalid pointer or length: ptr={}, len={}", ptr, len);
    }
    let ptr = ptr as usize;
    let len = len as usize;

    // Check for arithmetic overflow in ptr + len.
    let end = ptr.checked_add(len)
        .ok_or_else(|| anyhow::anyhow!("pointer arithmetic overflow"))?;

    let mem_data = memory.data(caller);

    // Verify the region is within current memory size.
    if end > mem_data.len() {
        anyhow::bail!(
            "out-of-bounds memory access: [{}, {}) exceeds memory size {}",
            ptr, end, mem_data.len()
        );
    }

    Ok(&mem_data[ptr..end])
}

/// Write a byte slice to WASM linear memory.
/// Validates that [ptr, ptr+len) is fully within current memory bounds.
fn write_guest_bytes(
    caller: &mut Caller<'_, ()>,
    memory: &Memory,
    ptr: i32,
    len: i32,
    data: &[u8],
) -> anyhow::Result<()> {
    if ptr < 0 || len < 0 {
        anyhow::bail!("invalid write pointer or length");
    }
    let ptr = ptr as usize;
    let len = len as usize;
    if data.len() > len {
        anyhow::bail!("write data ({} bytes) exceeds buffer length ({})", data.len(), len);
    }
    let end = ptr.checked_add(len)
        .ok_or_else(|| anyhow::anyhow!("write pointer arithmetic overflow"))?;

    let mem_data = memory.data_mut(caller);
    if end > mem_data.len() {
        anyhow::bail!("out-of-bounds write: [{}, {}) exceeds memory size {}", ptr, end, mem_data.len());
    }

    mem_data[ptr..ptr + data.len()].copy_from_slice(data);
    Ok(())
}

Apply this to every host function that accepts pointer arguments. A host function that does not use this helper should not compile without explicit justification:

fn register_log_message(linker: &mut Linker<()>) -> anyhow::Result<()> {
    linker.func_wrap(
        "env",
        "log_message",
        |mut caller: Caller<'_, ()>, ptr: i32, len: i32| {
            let memory = caller.get_export("memory")
                .and_then(|e| e.into_memory())
                .ok_or_else(|| anyhow::anyhow!("no memory export"))?;

            // Enforce a maximum length before reading.
            // Prevents the module from causing the host to read
            // arbitrarily large slices even within memory bounds.
            const MAX_LOG_LEN: i32 = 4096;
            if len > MAX_LOG_LEN {
                anyhow::bail!("log_message: len {} exceeds maximum {}", len, MAX_LOG_LEN);
            }

            let bytes = read_guest_bytes(&caller, &memory, ptr, len)?;
            // Validate UTF-8 before logging to prevent log injection.
            let msg = std::str::from_utf8(bytes)
                .map_err(|_| anyhow::anyhow!("log_message: invalid UTF-8"))?;
            // Strip control characters to prevent terminal injection.
            let sanitised: String = msg.chars()
                .filter(|c| !c.is_control() || *c == '\n')
                .collect();
            log::info!(target: "wasm-guest", "{}", sanitised);
            Ok(())
        },
    )?;
    Ok(())
}

Step 3: Copy-Before-Use to Prevent TOCTOU

When a WASM module has access to shared memory (enabled by the threads proposal) or when the host runtime uses multiple threads to service calls into the same module, a second thread can modify linear memory between the time the host validates a value and the time it uses it. The mitigation is to copy the argument bytes from linear memory into host-owned memory before performing any validation or use, and then validate and act on the host-owned copy exclusively.

fn register_open_file(linker: &mut Linker<()>) -> anyhow::Result<()> {
    linker.func_wrap(
        "env",
        "open_file",
        |mut caller: Caller<'_, ()>, path_ptr: i32, path_len: i32| -> anyhow::Result<i32> {
            let memory = caller.get_export("memory")
                .and_then(|e| e.into_memory())
                .ok_or_else(|| anyhow::anyhow!("no memory export"))?;

            // Step 1: Copy the path bytes into host-owned Vec.
            // After this copy, the WASM module cannot change the bytes
            // we are about to validate, even with a shared memory race.
            let raw = read_guest_bytes(&caller, &memory, path_ptr, path_len)?;
            let path_bytes: Vec<u8> = raw.to_vec();   // host-owned copy

            // Step 2: Validate the host-owned copy.
            let path_str = std::str::from_utf8(&path_bytes)
                .map_err(|_| anyhow::anyhow!("open_file: non-UTF-8 path"))?;

            // Canonicalise to resolve .. components before checking prefix.
            // Use the allowed root defined at linker construction time.
            let allowed_root = std::path::Path::new("/var/wasm-scratch");
            let requested = allowed_root.join(path_str);
            let canonical = requested.canonicalize()
                .map_err(|e| anyhow::anyhow!("open_file: cannot canonicalise path: {}", e))?;

            if !canonical.starts_with(allowed_root) {
                anyhow::bail!("open_file: path escape attempt: {:?}", canonical);
            }

            // Step 3: Use the canonical, validated, host-owned path.
            // We do NOT re-read from linear memory here.
            let fd = std::fs::File::open(&canonical)
                .map_err(|e| anyhow::anyhow!("open_file: {}", e))?;

            // Register the file descriptor and return a guest-facing handle.
            // (Handle table management omitted for brevity.)
            let handle_id = register_fd_handle(fd);
            Ok(handle_id)
        },
    )?;
    Ok(())
}

The critical discipline is: after the to_vec() call, never read from path_ptr again. All subsequent operations — validation, canonicalisation, the final open — operate on canonical, which is derived entirely from the host-owned copy.

Step 4: Validate Enum and Flag Arguments

Host functions that accept integer discriminants representing enum values or bitfield flags must validate the range before interpreting the value. An integer that maps to no valid variant must be rejected with an error, not passed to a cast or a switch statement.

#[derive(Debug, Clone, Copy)]
enum Permission {
    Read  = 0,
    Write = 1,
    Exec  = 2,
}

impl TryFrom<i32> for Permission {
    type Error = anyhow::Error;

    fn try_from(v: i32) -> anyhow::Result<Self> {
        match v {
            0 => Ok(Permission::Read),
            1 => Ok(Permission::Write),
            2 => Ok(Permission::Exec),
            _ => anyhow::bail!("invalid Permission discriminant: {}", v),
        }
    }
}

fn register_check_permission(linker: &mut Linker<()>) -> anyhow::Result<()> {
    linker.func_wrap(
        "env",
        "check_permission",
        |_caller: Caller<'_, ()>, perm: i32| -> anyhow::Result<i32> {
            // Validate before use — never transmute or cast directly.
            let permission = Permission::try_from(perm)?;
            let granted = evaluate_permission(permission);
            Ok(if granted { 1 } else { 0 })
        },
    )?;
    Ok(())
}

For bitfield flags, mask against the set of all valid bits before testing any individual bit:

const VALID_FLAG_MASK: i32 = 0b0000_0111;  // bits 0, 1, 2 are defined

fn validate_flags(flags: i32) -> anyhow::Result<i32> {
    if flags & !VALID_FLAG_MASK != 0 {
        anyhow::bail!("flags value 0x{:08x} contains undefined bits", flags);
    }
    Ok(flags)
}

Step 5: Enforce Path Confinement for Filesystem Host Functions

Any host function that accepts a filesystem path must enforce that the resolved path remains within the allowed directory root. Path validation at the string level — checking that the string does not contain .. — is insufficient. The path must be canonicalised after joining with the allowed root to resolve all symlinks and . / .. components, and the canonical path must be tested as a prefix match against the allowed root’s canonical path.

fn confine_path(
    allowed_root: &std::path::Path,
    guest_path: &str,
) -> anyhow::Result<std::path::PathBuf> {
    // Reject absolute paths from the guest immediately.
    // All guest paths must be relative to the allowed root.
    if std::path::Path::new(guest_path).is_absolute() {
        anyhow::bail!("absolute paths are not permitted from guest");
    }

    // Join before canonicalising — canonicalize resolves the full path.
    let joined = allowed_root.join(guest_path);
    let canonical = joined.canonicalize()
        .map_err(|e| anyhow::anyhow!("path canonicalisation failed: {}", e))?;

    // The canonical path must start with the canonical allowed root.
    let canonical_root = allowed_root.canonicalize()?;
    if !canonical.starts_with(&canonical_root) {
        anyhow::bail!(
            "path escape: {:?} is outside allowed root {:?}",
            canonical, canonical_root
        );
    }

    Ok(canonical)
}

One edge case: if the target path does not yet exist, canonicalize() returns an error on most platforms. For write operations targeting new files, canonicalise the parent directory and validate the parent, then append the filename component separately after validating it contains no path separators:

fn confine_new_file_path(
    allowed_root: &std::path::Path,
    guest_path: &str,
) -> anyhow::Result<std::path::PathBuf> {
    let p = std::path::Path::new(guest_path);
    if p.is_absolute() {
        anyhow::bail!("absolute paths not permitted");
    }
    let filename = p.file_name()
        .ok_or_else(|| anyhow::anyhow!("path has no filename component"))?;
    // Filename must not contain path separators.
    if filename.to_string_lossy().contains('/') || filename.to_string_lossy().contains('\\') {
        anyhow::bail!("filename contains path separator");
    }
    let parent = p.parent().unwrap_or(std::path::Path::new(""));
    let joined_parent = allowed_root.join(parent);
    let canonical_parent = joined_parent.canonicalize()
        .map_err(|e| anyhow::anyhow!("parent directory canonicalisation failed: {}", e))?;
    let canonical_root = allowed_root.canonicalize()?;
    if !canonical_parent.starts_with(&canonical_root) {
        anyhow::bail!("path escape in parent directory");
    }
    Ok(canonical_parent.join(filename))
}

Step 6: Use Wasmtime’s Linker API for Safe Host Function Registration

Wasmtime’s Linker API is the correct mechanism for registering host functions. Avoid using lower-level Func::new with raw Val arrays unless there is a specific reason: func_wrap provides statically typed function signatures that prevent type mismatches at the Rust compiler level.

use wasmtime::{Engine, Linker, Store};

fn build_production_linker(engine: &Engine) -> anyhow::Result<Linker<HostState>> {
    let mut linker: Linker<HostState> = Linker::new(engine);

    // func_wrap: types are checked at compile time.
    // The closure signature must match (i32, i32) -> Result<(), _>
    // for a host function declared as (param i32 i32) (result) in WASM.
    linker.func_wrap("env", "log_message",
        |mut caller: Caller<'_, HostState>, ptr: i32, len: i32| -> anyhow::Result<()> {
            // ... validated implementation
            Ok(())
        }
    )?;

    // Explicitly do NOT call:
    // linker.allow_shadowing(true)   — prevents later registration from silently overriding earlier
    // linker.allow_unknown_imports(true)  — would satisfy unregistered imports with traps

    Ok(linker)
}

Store per-module state — capability grants, allowed paths, per-tenant identifiers — in the Store<T> data, not in global state. Accessing caller.data() within a host function returns the per-store state for the calling module’s store, preventing cross-module state leakage:

struct HostState {
    allowed_root: std::path::PathBuf,
    tenant_id:    String,
    call_budget:  u64,
}

// Inside a host function:
let state = caller.data();
let allowed_root = &state.allowed_root;

Step 7: Fuzz Host Functions with WASM Modules as Drivers

Host function validation logic is best tested with fuzzing that generates arbitrary WASM-level argument values, including out-of-range pointers, maximum integer values, negative lengths, and boundary values. Write a WASM module that accepts fuzzer-provided bytes and uses them directly as host function arguments:

// fuzz/fuzz_targets/host_functions.rs
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Build the WASM module bytes for a simple fuzzing driver.
    // The driver imports the target host function and calls it
    // with values derived from the fuzzer input.
    let wasm = build_fuzz_driver_wasm(data);

    let engine = wasmtime::Engine::default();
    let module = match wasmtime::Module::new(&engine, &wasm) {
        Ok(m) => m,
        Err(_) => return,  // invalid WASM; skip
    };
    let mut store = wasmtime::Store::new(&engine, HostState::default());
    let linker = build_production_linker(&engine).unwrap();

    // Instantiation and execution must not panic, crash, or corrupt host state.
    // Any error return is acceptable; panics or memory corruption are not.
    let _ = linker.instantiate(&mut store, &module)
        .and_then(|instance| {
            let call = instance.get_typed_func::<(), ()>(&mut store, "fuzz_call")?;
            call.call(&mut store, ())
        });
});

The property under test is that no combination of valid WASM instructions and arbitrary argument values — including values that are individually in range but combine to out-of-bounds regions, or values that trigger integer overflow in arithmetic — causes the host to panic, corrupt state, or return a result to the module that grants unintended access.

Verification

After implementing host function hardening, verify the controls are effective:

# 1. List all imports in a WASM module and compare against the approved set.
wasm-tools print plugin.wasm | grep '(import'

# 2. Validate the module's imports against your policy using wasmparser.
cargo run --bin audit-imports -- plugin.wasm module-policy.toml

# 3. Attempt path escape via a test WASM module that passes ../.. paths.
# Expected: instantiation succeeds; the host function returns an error value.
# Not acceptable: host panic, host crash, or file written outside scratch dir.
cargo test --test integration -- host_function_path_escape

# 4. Run the fuzzer for a minimum period before deploying.
cargo fuzz run host_functions -- -max_total_time=3600

# 5. Confirm unknown imports cause instantiation failure, not silent traps.
# Build a WASM module that imports env/not_a_real_function.
wasm-tools smith --seed 0 --disallow-traps | cargo run --bin check-unknown-imports

What This Does Not Cover

Host function security addresses the WASM-to-host call path. It does not address:

  • Within-WASM sandbox memory safety. A WASM module that is compromised through an in-sandbox buffer overflow can call host functions on the attacker’s behalf using its legitimate capability grants. See WASM Linear Memory Safety.
  • externref and reference type security. Host objects passed as externref values have their own validation requirements at the binding layer. See WASM Reference Types and Host Binding Security.
  • WASI capability scoping. Modules using WASI interfaces receive capabilities through WASI-defined host functions. Configure WASI capability grants using WasiCtxBuilder rather than building ad-hoc equivalents. See Wasmtime Production Hardening.
  • Module supply chain. A malicious module can be designed to probe host function implementations deliberately. Validate module provenance before loading. See WASM OCI Signing.

Summary

Host functions are the complete attack surface for sandbox escape and privilege escalation in WASM-based systems. The key controls are:

  1. Audit imports against a per-module allowlist and refuse instantiation of modules that request functions they are not approved to use.
  2. Validate every pointer and length argument against current linear memory bounds before dereferencing, using a single validated helper function throughout the host codebase.
  3. Copy arguments into host-owned memory before validation and use to prevent TOCTOU races in shared-memory or multi-threaded configurations.
  4. Validate enum discriminants and flag values against their full set of valid values; reject out-of-range inputs before any cast or match.
  5. Enforce path confinement by canonicalising paths after joining with the allowed root, and prefix-matching the canonical result against the canonical root.
  6. Use Wasmtime’s Linker and func_wrap for statically typed host function registration; store per-module state in Store<T> rather than global variables.
  7. Fuzz host functions with WASM modules as drivers to find argument combinations that cause host panics, state corruption, or unexpected capability grants.

The goal is a host API that is as narrow as the module’s genuine requirements, with every entry point treating all guest-supplied values as untrusted input regardless of whether the module is first-party or third-party code.