Tamper-Evident AI Decision Logs Using Wasm Runtime Attestation

Problem

As AI systems make decisions with regulatory and legal consequences — credit approvals, healthcare triage, insurance underwriting, content moderation, fraud detection — the question of provenance becomes critical: can you prove, after the fact, exactly which model made which decision on which input?

The conventional audit log approach records decision outputs alongside timestamps and identifiers. This is necessary but not sufficient. A conventional log can be falsified: the output can be changed after the fact, the claimed model version can be incorrect, or the input can be silently modified before logging. None of this falsification is detectable from the log itself.

The problem has three distinct components:

Model identity. “Model v3.2 made this decision” is a claim. Without cryptographic binding between the model artifact and the decision record, this claim can be changed. Model weights can be swapped, model version metadata can be edited, and log entries can be backdated. You can claim any model made any decision.

Input integrity. The input to an AI model is often a processed representation of raw data — a feature vector, a tokenised text, a normalised image. The raw-to-processed transformation is typically undocumented in decision logs. An audit that shows “input: [0.2, 0.8, 1.0, …]” cannot be traced back to the original customer record without additional infrastructure.

Execution determinism. Even with the same model and input, floating-point non-determinism, hardware differences, and runtime configuration can produce different outputs. A claim that “this specific output was produced by this specific model on this specific input” requires deterministic execution — otherwise, reproducing the inference only proves what the model would produce now, not what it produced at decision time.

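Wasm addresses each of these, at least in part. A Wasm module can be content-addressed: the SHA-256 hash of its binary identifies its code. Wasm execution is largely deterministic: integer operations always are, floating-point results can differ only in NaN bit patterns (outside relaxed SIMD), and host-supplied inputs such as WASI randomness and clocks must be pinned or seeded. And Wasm module signing with Sigstore, COSE, or similar schemes creates a verifiable link between a module hash and a signing identity.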

The architecture is: package the AI inference function as a Wasm module, sign it, record the module hash and signature in the decision log alongside the input hash and output, and store the signed execution record in an append-only log. Any future audit can verify that the logged module hash corresponds to the signed module, that the module was signed by the expected signing key, and that the input hash matches the recorded input. This is more than an audit log: each record is cryptographically bound to a specific signed module and a specific input, so later edits become evident.
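
The binding described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the record format defined later in Step 2: the function names (`build_decision_record`, `matches`) and the field layout are assumptions chosen for the example.

```python
import hashlib
import time


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_decision_record(module_bytes: bytes, signature_ref: str,
                          raw_input: bytes, decision: dict) -> dict:
    """Bind a decision to the exact module bytes and input bytes behind it."""
    return {
        "executed_at": int(time.time()),
        "module_hash": sha256_hex(module_bytes),   # identifies the code
        "module_signature_ref": signature_ref,     # where the signature lives
        "raw_input_hash": sha256_hex(raw_input),   # identifies the input
        "decision": decision,
    }


def matches(record: dict, module_bytes: bytes, raw_input: bytes) -> bool:
    """Audit-time check: do the presented module and input match the record?"""
    return (record["module_hash"] == sha256_hex(module_bytes)
            and record["raw_input_hash"] == sha256_hex(raw_input))
```

Note that the input hash covers the exact bytes supplied, so the same logical input serialised differently (key order, whitespace) produces a different hash. Store the original bytes, or agree on a canonical serialisation.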

The constraint: this architecture applies to inference functions that can be packaged as Wasm. ONNX runtime, TensorFlow Lite, and custom feature-computation functions can all be compiled to Wasm32. Large GPU-based LLM inference cannot (currently) run in Wasm, but the feature extraction, pre-processing, and post-processing stages that determine the effective input and output can.

Target systems: regulated industries requiring AI decision auditability (financial services, healthcare, insurance, HR); organisations subject to GDPR Article 22 automated decision-making requirements; EU AI Act Article 13 transparency obligations; any deployment where “which model made this decision on what input” needs to be provable.

Threat Model

The adversarial scenario this architecture defends against is post-hoc falsification of AI decision records by insiders:

Adversary 1 — Retroactive model version manipulation. An AI system makes a biased credit decision. A regulator requests the model version and decision log. An insider changes the model version field in the database from “v1.2 (unvalidated)” to “v2.0 (validated)”. With conventional logging: undetectable. With Wasm attestation: the module hash in the log does not match the signed v2.0 module, so the falsification is detectable.

Adversary 2 — Input sanitisation post-hoc. An AI makes a hiring decision based on a feature vector that included a protected attribute. After the decision, someone modifies the stored input to remove the protected attribute. The log now shows the decision was made on compliant features. With Wasm attestation: input hash in the log does not match the modified input; tampering is detectable.

Adversary 3 — Unverified model deployment. A production model is replaced with an unvalidated version. The system continues to log a model version identifier, but the actual running code is different. Decisions made by the unvalidated model are attributed to the validated one. With module hash in the attestation: the hash does not match the validated module’s hash; substitution is detectable.

Adversary 4 — Log deletion or modification. The append-only log containing attestation records is modified to remove unfavourable decisions. With a transparency log (Rekor-style): each record is committed to a Merkle tree; removal is detectable.
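
To illustrate why a Merkle-tree commitment makes removal detectable, here is a small sketch of a Merkle root computed in the style of RFC 6962 (leaf and interior nodes are hashed with different prefixes). It is a simplified illustration, not Rekor's implementation; the function name `merkle_root` is an assumption for this example.

```python
import hashlib


def _leaf(data: bytes) -> bytes:
    """Leaf hash: SHA-256 over a 0x00 prefix plus the entry bytes."""
    return hashlib.sha256(b"\x00" + data).digest()


def _node(left: bytes, right: bytes) -> bytes:
    """Interior hash: SHA-256 over a 0x01 prefix plus both child hashes."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Merkle tree hash over an ordered list of log entries."""
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return _leaf(entries[0])
    # Split at the largest power of two strictly less than len(entries)
    k = 1
    while k * 2 < len(entries):
        k *= 2
    return _node(merkle_root(entries[:k]), merkle_root(entries[k:]))
```

Anyone who has previously recorded the root can detect a removed or altered entry, because the recomputed root no longer matches the published one.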

Configuration / Implementation

Step 1 — Package inference preprocessing as a Wasm module

The key insight: you don’t need to run GPU inference in Wasm. Package the components that can be Wasm-compiled — feature extraction, input normalisation, rule application, post-processing:

// feature_extractor/src/lib.rs
// Deterministic feature extraction packaged as a Wasm module
// This is what we attest, not the GPU inference step

use wasm_bindgen::prelude::*;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct RawInput {
    pub customer_id: String,
    pub features: Vec<f32>,
    pub timestamp: i64,
}

#[derive(Serialize, Deserialize)]
pub struct ProcessedFeatures {
    pub normalised: Vec<f32>,
    pub feature_names: Vec<String>,
    pub input_hash: String,  // SHA-256 of the raw input JSON string, hex-encoded
}

#[wasm_bindgen]
pub fn extract_features(raw_input_json: &str) -> String {
    let input: RawInput = serde_json::from_str(raw_input_json)
        .expect("Invalid input");
    
    // Deterministic normalisation
    let normalised: Vec<f32> = input.features.iter()
        .map(|&f| (f - 0.5) / 0.25)  // z-score with known population stats
        .collect();
    
    // Hash the input for integrity verification
    use sha2::{Sha256, Digest};
    let input_hash = hex::encode(
        Sha256::digest(raw_input_json.as_bytes())
    );
    
    serde_json::to_string(&ProcessedFeatures {
        normalised,
        feature_names: vec!["age".to_string(), "income".to_string(), "score".to_string()],
        input_hash,
    }).unwrap()
}

Compile to Wasm:

# Build the Wasm module
wasm-pack build --target nodejs --release

# Record the module digest, then sign the module
# cosign sign-blob produces a Sigstore signature over the file (this is not COSE)
DIGEST=$(sha256sum pkg/feature_extractor_bg.wasm | cut -d' ' -f1)
echo "Module digest: $DIGEST"

# Sign with Sigstore keyless (requires OIDC token)
cosign sign-blob \
  --bundle feature_extractor.bundle \
  pkg/feature_extractor_bg.wasm

# Or sign with a local key
cosign sign-blob \
  --key cosign.key \
  --output-signature feature_extractor.sig \
  pkg/feature_extractor_bg.wasm

Step 2 — Build the attestation record structure

// attestation/src/lib.rs

use sha2::{Sha256, Digest};
use serde::{Deserialize, Serialize};
use chrono::Utc;

#[derive(Serialize, Deserialize, Clone)]
pub struct WasmExecutionAttestation {
    /// Version of this attestation format
    pub schema_version: u8,
    
    /// Timestamp of execution (Unix seconds)
    pub executed_at: i64,
    
    /// SHA-256 of the Wasm module binary that executed
    /// This binds the decision to a specific, verifiable module
    pub module_hash: String,
    
    /// Reference to the module signing bundle (Sigstore/COSE)
    pub module_signature_ref: String,
    
    /// SHA-256 of the raw input before any processing
    /// Stored separately from processed features
    pub raw_input_hash: String,
    
    /// SHA-256 of the processed features (output of Wasm module)
    pub processed_input_hash: String,
    
    /// SHA-256 of the model artifact used for inference
    /// For external inference (GPU LLM), this is the model weights hash
    pub model_artifact_hash: String,
    
    /// The decision output (not the full response — just what matters for audit)
    pub decision: serde_json::Value,
    
    /// SHA-256 of this record serialised while record_hash is None (integrity check)
    pub record_hash: Option<String>,
}

impl WasmExecutionAttestation {
    pub fn new(
        module_hash: String,
        module_signature_ref: String,
        raw_input: &str,
        processed_input: &str,
        model_artifact_hash: String,
        decision: serde_json::Value,
    ) -> Self {
        let mut record = Self {
            schema_version: 1,
            executed_at: Utc::now().timestamp(),
            module_hash,
            module_signature_ref,
            raw_input_hash: hex::encode(Sha256::digest(raw_input.as_bytes())),
            processed_input_hash: hex::encode(Sha256::digest(processed_input.as_bytes())),
            model_artifact_hash,
            decision,
            record_hash: None,
        };
        
        // Hash the record while record_hash is still None (it serialises as null);
        // a verifier must reset record_hash to None before recomputing
        let serialised = serde_json::to_string(&record).unwrap();
        record.record_hash = Some(hex::encode(Sha256::digest(serialised.as_bytes())));
        record
    }
    
    pub fn verify(&self, module_bytes: &[u8]) -> Result<(), String> {
        // Verify module hash matches the actual module
        let actual_hash = hex::encode(Sha256::digest(module_bytes));
        if actual_hash != self.module_hash {
            return Err(format!(
                "Module hash mismatch: record says {} but actual is {}",
                self.module_hash, actual_hash
            ));
        }
        Ok(())
    }
}
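
The self-hash convention used by `record_hash` can be checked as sketched below. This Python version is an analogue for illustration and is not byte-compatible with the Rust struct's `serde_json` output; `canonical`, `seal`, and `self_hash_ok` are hypothetical names. A self-hash catches accidental corruption and naive edits only, since anyone who can rewrite the record can also recompute the hash. The signature and the append-only log provide the stronger guarantee.

```python
import hashlib
import json


def canonical(record: dict) -> bytes:
    """Serialise with sorted keys and no extra whitespace so the bytes are stable."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def seal(record: dict) -> dict:
    """Return a copy whose record_hash covers every other field."""
    body = {**record, "record_hash": None}
    return {**body, "record_hash": hashlib.sha256(canonical(body)).hexdigest()}


def self_hash_ok(record: dict) -> bool:
    """Recompute the hash with record_hash reset to None and compare."""
    body = {**record, "record_hash": None}
    return record.get("record_hash") == hashlib.sha256(canonical(body)).hexdigest()
```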

Step 3 — Integrate with an append-only transparency log

# decision_log.py
# Store attestation records in an append-only log
# Using Rekor (Sigstore's transparency log) or a local equivalent

import hashlib
import json
import requests
from datetime import datetime

class AppendOnlyDecisionLog:
    """
    Stores AI decision attestations in a tamper-evident append-only log.
    Uses Rekor for public accountability or a local equivalent for private deployments.
    """
    
    REKOR_URL = "https://rekor.sigstore.dev"
    
    def __init__(self, use_rekor: bool = False, local_log_path: str = "/var/log/ai-decisions"):
        self.use_rekor = use_rekor
        self.local_log_path = local_log_path
    
    def append(self, attestation: dict) -> str:
        """Append an attestation record and return the log entry ID."""
        serialised = json.dumps(attestation, sort_keys=True)
        record_hash = hashlib.sha256(serialised.encode()).hexdigest()
        
        if self.use_rekor:
            return self._append_to_rekor(serialised, record_hash)
        else:
            return self._append_to_local_log(serialised, record_hash)
    
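    def _append_to_rekor(self, serialised: str, record_hash: str) -> str:
        """Submit attestation to Sigstore Rekor for public accountability."""
        import base64

        # hashedrekord needs a real signature over the serialised record, made with
        # the private key matching the public key below. sign_record() is a
        # placeholder for your own signing function (returns raw signature bytes).
        signature = sign_record(serialised.encode())

        response = requests.post(
            f"{self.REKOR_URL}/api/v1/log/entries",
            json={
                "apiVersion": "0.0.1",
                "kind": "hashedrekord",
                "spec": {
                    "data": {
                        "hash": {
                            "algorithm": "sha256",
                            "value": record_hash
                        }
                    },
                    "signature": {
                        "content": base64.b64encode(signature).decode(),
                        "publicKey": {"content": "YOUR_PUBLIC_KEY_B64"}  # base64 of the PEM public key
                    }
                }
            },
            timeout=30,
        )
        response.raise_for_status()
        # Rekor returns a JSON object keyed by the new entry's UUID
        return next(iter(response.json()), record_hash)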
    
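    def _append_to_local_log(self, serialised: str, record_hash: str) -> str:
        """Append to a local log in which each entry commits to the previous entry's hash (a hash chain)."""
        import os
        os.makedirs(self.local_log_path, exist_ok=True)  # create the log directory on first use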
        
        # Read last entry hash for chaining
        last_hash = "0" * 64  # Genesis hash
        log_file = f"{self.local_log_path}/decisions.jsonl"
        
        if os.path.exists(log_file):
            with open(log_file, 'rb') as f:
                # Get last non-empty line
                lines = f.read().split(b'\n')
                for line in reversed(lines):
                    if line:
                        try:
                            last_entry = json.loads(line)
                            last_hash = last_entry.get("entry_hash", last_hash)
                            break
                        except json.JSONDecodeError:
                            continue
        
        # Chain this entry to the previous
        entry = {
            "timestamp": datetime.utcnow().isoformat(),
            "record_hash": record_hash,
            "previous_hash": last_hash,
            "attestation": json.loads(serialised),
        }
        
        # Entry hash includes the chain
        entry_hash = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        entry["entry_hash"] = entry_hash
        
        with open(log_file, 'a') as f:
            f.write(json.dumps(entry) + '\n')
        
        return entry_hash
    
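    def verify_chain(self) -> tuple[bool, list[str]]:
        """Verify chain links and recompute every stored hash in the decision log."""
        log_file = f"{self.local_log_path}/decisions.jsonl"
        errors = []
        prev_hash = "0" * 64
        
        with open(log_file, 'r') as f:
            for line_num, line in enumerate(f, start=1):
                if not line.strip():
                    continue
                entry = json.loads(line)
                
                # Link check: each entry must point at the previous entry's hash
                if entry.get("previous_hash") != prev_hash:
                    errors.append(f"Chain break at line {line_num}: expected {prev_hash}, got {entry.get('previous_hash')}")
                
                # Content check: recompute the entry hash over everything except entry_hash
                body = {k: v for k, v in entry.items() if k != "entry_hash"}
                expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
                if entry.get("entry_hash") != expected:
                    errors.append(f"Entry hash mismatch at line {line_num}")
                
                # Attestation check: recompute the record hash the same way append() did
                att_hash = hashlib.sha256(json.dumps(entry.get("attestation"), sort_keys=True).encode()).hexdigest()
                if entry.get("record_hash") != att_hash:
                    errors.append(f"Record hash mismatch at line {line_num}")
                
                prev_hash = entry.get("entry_hash", "")
        
        return len(errors) == 0, errors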
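
A hash chain on its own cannot reveal that the most recent entries were cut off, because a truncated chain is still internally consistent. The sketch below shows the idea on an in-memory chain and adds a comparison against a head hash kept outside the log. The names `append`, `verify`, and `anchored_head` are hypothetical and independent of the class above; where the head hash is stored (a separate system, a printed report, a transparency log) is left as an assumption.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """Hash of an entry's content, serialised with sorted keys."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append(chain: list[dict], attestation: dict) -> str:
    """Append an entry that commits to the previous entry's hash; return the new head."""
    prev = chain[-1]["entry_hash"] if chain else "0" * 64
    body = {"previous_hash": prev, "attestation": attestation}
    chain.append({**body, "entry_hash": entry_hash(body)})
    return chain[-1]["entry_hash"]


def verify(chain: list[dict], anchored_head: str) -> list[str]:
    """Check links, per-entry hashes, and the externally stored head hash."""
    errors, prev = [], "0" * 64
    for i, e in enumerate(chain):
        if e["previous_hash"] != prev:
            errors.append(f"chain break at entry {i}")
        body = {k: v for k, v in e.items() if k != "entry_hash"}
        if e["entry_hash"] != entry_hash(body):
            errors.append(f"entry {i} content does not match its hash")
        prev = e["entry_hash"]
    if prev != anchored_head:
        errors.append("head hash differs from the externally anchored value")
    return errors
```

Removing a middle entry shows up as a chain break, editing an entry shows up as a content mismatch, and dropping the tail shows up only through the anchored head comparison.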

Step 4 — Audit a historical decision

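import hashlib

def audit_decision(decision_id: str, claimed_module_path: str, claimed_input: str) -> dict:
    """
    Verify that a historical decision was made by the claimed module on the claimed input.

    load_attestation_from_log() and verify_cosign_signature() are application-specific
    helpers that you supply.
    """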
    # Load the attestation record
    attestation = load_attestation_from_log(decision_id)
    
    # Verify 1: Module hash matches claimed module
    with open(claimed_module_path, 'rb') as f:
        module_bytes = f.read()
    actual_module_hash = hashlib.sha256(module_bytes).hexdigest()
    
    module_match = actual_module_hash == attestation["module_hash"]
    
    # Verify 2: Input hash matches claimed input
    actual_input_hash = hashlib.sha256(claimed_input.encode()).hexdigest()
    input_match = actual_input_hash == attestation["raw_input_hash"]
    
    # Verify 3: Module signature is valid
    sig_valid = verify_cosign_signature(
        claimed_module_path,
        attestation["module_signature_ref"]
    )
    
    return {
        "decision_id": decision_id,
        "module_hash_match": module_match,
        "input_hash_match": input_match,
        "signature_valid": sig_valid,
        "all_verified": module_match and input_match and sig_valid,
        "decision": attestation["decision"],
        "executed_at": attestation["executed_at"],
    }
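
One possible shape for the `verify_cosign_signature` helper is sketched below. It shells out to `cosign verify-blob` with the `--key` and `--signature` flags, matching the local-key signing command from Step 1. The assumptions here are that `module_signature_ref` is a local path to the signature file and that `cosign.pub` is the public half of `cosign.key`. Keyless (bundle-based) verification needs different flags and is not covered.

```python
import subprocess


def cosign_verify_command(blob_path: str, signature_path: str,
                          public_key_path: str) -> list[str]:
    """Build the `cosign verify-blob` invocation for a key-based signature."""
    return [
        "cosign", "verify-blob",
        "--key", public_key_path,
        "--signature", signature_path,
        blob_path,
    ]


def verify_cosign_signature(blob_path: str, signature_path: str,
                            public_key_path: str = "cosign.pub") -> bool:
    """Return True only if cosign exits successfully; any failure counts as invalid."""
    try:
        result = subprocess.run(
            cosign_verify_command(blob_path, signature_path, public_key_path),
            capture_output=True, text=True, timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # cosign missing or hung: treat as unverified
    return result.returncode == 0
```

Treating every error as "not verified" keeps the audit result conservative: a missing binary or a timeout never reports a valid signature.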

Expected Behaviour

Audit scenario	Without Wasm attestation	With Wasm attestation
Claim: “model v2.0 made this decision”	Cannot verify; trust the claim	Verify: module hash in log matches signed v2.0 module
Input modified after decision	Cannot detect	Input hash in log does not match modified input
Model swapped in production	Undetectable	Module hash differs from signed approved module
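Log entry deleted	Undetectable	Chain break detected by `verify_chain()`; deletion of the most recent entries additionally requires comparing against an externally stored head hash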
Regulator requests proof of compliant model	Present log entry (unverified)	Present attestation bundle + module signature + transparency log inclusion

Trade-offs

Aspect	Benefit	Cost	Mitigation
Wasm compilation requirement for feature extraction	Strong attestation properties	Not all ML preprocessing can be compiled to Wasm	Attest the components that can be compiled; use model artifact hash for GPU inference component
Transparency log (Rekor)	Public accountability; tamper evidence	Publicly reveals that you made specific numbers of decisions	Use private Rekor deployment for sensitive data; or use local chained log
Record includes decision output	Enables retrospective auditing	Stores potentially sensitive decision outputs	Apply data minimisation: log decision class, not full output; retain full output in encrypted separate store
Deterministic Wasm execution	Reproducible verification	Floating-point results are not guaranteed bit-identical in every case (NaN bit patterns, relaxed SIMD), and host-provided inputs vary	Use integer or fixed-point arithmetic where possible in feature extraction; canonicalise NaNs; document non-deterministic operations
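
As an illustration of the integer-arithmetic mitigation, the normalisation from Step 1, `(f - 0.5) / 0.25`, can be expressed in fixed point. This is a sketch under assumed conventions: the scale of one million and the names `to_fixed` and `normalise_fixed` are choices made for this example, not part of the module above.

```python
from decimal import Decimal

SCALE = 1_000_000  # fixed-point scale: integers represent millionths


def to_fixed(text: str) -> int:
    """Parse a decimal string such as "0.75" into scaled integer units without floats."""
    return int((Decimal(text) * SCALE).to_integral_value())


def normalise_fixed(x: int, mean: int = SCALE // 2, std: int = SCALE // 4) -> int:
    """Compute (x - mean) / std in fixed point using integer arithmetic only."""
    # Floor division rounds toward negative infinity, identically on every platform
    return (x - mean) * SCALE // std
```

For example, `normalise_fixed(to_fixed("0.75"))` yields `1_000_000`, the fixed-point representation of 1.0. A Rust port would need to pick its rounding explicitly, since integer division there truncates toward zero.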

Failure Modes

Failure	Symptom	Detection	Recovery
Module rebuilt without updating attestation references	Audit shows module hash mismatch	Automated module hash check fails on deployment	Integrate module signing into CI/CD pipeline; block deployment if signing fails
Wasm feature extraction non-deterministic	Reproducing decision from log produces different features	Audit verification fails for a known-correct record	Identify and fix non-deterministic operations; re-attest affected historical records with a supplementary note
Log corruption	`verify_chain()` reports chain breaks	Routine log integrity check (run daily)	Restore from backup; the chain break itself is evidence of tampering for the audit trail
Append-only log grows unbounded	Storage cost becomes prohibitive	Monitor log volume; alert when approaching storage quota	Implement log archival with Merkle proof preservation; archive old entries to cold storage while preserving verifiability

Wasm Module Signing with COSE — the module signing infrastructure that provides the cryptographic binding in attestation records
Wasm Runtime Attestation — broader Wasm runtime attestation for verifying execution environment integrity
Wasm SBOM and Provenance — SBOM-based provenance for the Wasm modules used in AI inference attestation
AI Model Cards — model documentation that complements execution attestation by recording model lineage and validation status
Auditing AI Actions — the broader audit framework for AI system accountability that execution attestation is a component of