CDN and Third-Party Script Supply Chain Security: Lessons from polyfill.io

Problem

In June 2024, polyfill.io — a CDN serving JavaScript polyfills to over 100,000 websites — was found to be serving malicious JavaScript. The attack vector was domain acquisition: a Chinese technology firm had purchased the polyfill.io domain and its associated GitHub repository earlier in the year. After acquisition, the CDN infrastructure was modified to inject malicious code into the polyfill bundles being served to end users. The injected code targeted specific user agents and performed redirects to scam and malware sites.

The polyfill.io incident is a clear example of a CDN supply chain attack:

A widely trusted CDN service is used by many websites
Ownership of the domain/service changes hands
The new operator modifies the served content to include malicious code
Every website including the CDN’s scripts now delivers malicious code to users
Detection is delayed because the CDN is trusted and the scripts appear to load from a legitimate, well-known domain

Why <script src="https://cdn.polyfill.io/..."> is dangerous. When a page includes a script from an external CDN, the browser executes that script with the full permissions of the page origin. The script can:

Read and exfiltrate cookies (including session tokens)
Capture form input (credentials, credit card data)
Make authenticated API calls on behalf of the user
Inject additional scripts from other sources
Modify page content to perform phishing

The scope is larger than polyfill.io. The same attack applies to any externally hosted script: analytics libraries (Google Analytics, Matomo), A/B testing tools (Optimizely, LaunchDarkly), customer support widgets (Intercom, Zendesk), payment SDKs (Stripe.js, PayPal), social login buttons (Google, Facebook), CDN-hosted JavaScript frameworks.

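Subresource Integrity is the technical control. SRI lets a page declare the expected cryptographic hash of a linked resource; if the served bytes do not match that hash, the browser refuses to execute them. SRI blocks this class of attack for static, versioned files, because any injected code changes the hash. It could not be applied to polyfill.io itself, which generated a different bundle for each user agent and so had no single stable hash; for dynamic endpoints like that, the control is self-hosting (Step 4).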
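
The check a browser performs can be approximated in a few lines. The following is a simplified sketch of SRI matching, not a full implementation of the specification; the function name `sri_matches` is hypothetical. It models two behaviours worth knowing: only the strongest listed algorithm is compared, and an integrity value with no usable entries enforces nothing.

```python
import base64
import hashlib

# Algorithms permitted in SRI integrity metadata, weakest to strongest.
SRI_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_matches(content: bytes, integrity: str) -> bool:
    """Simplified model of SRI: compare content against the strongest listed hashes."""
    candidates: dict[str, list[str]] = {}
    for token in integrity.split():
        alg, sep, b64_digest = token.partition("-")
        if not sep or alg not in SRI_ALGORITHMS:
            continue  # unknown or malformed entries are ignored
        b64_digest = b64_digest.split("?", 1)[0]  # drop optional "?options" suffix
        candidates.setdefault(alg, []).append(b64_digest)

    if not candidates:
        # No usable metadata: nothing is enforced and the resource loads.
        return True

    strongest = max(candidates, key=SRI_ALGORITHMS.index)
    digest = hashlib.new(strongest, content).digest()
    actual = base64.b64encode(digest).decode("ascii")
    return actual in candidates[strongest]
```

A mistyped or unsupported integrity value therefore fails open in this model, which is why generated hashes are worth validating before deployment.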

Target systems: any web application that loads JavaScript from external CDNs; organisations responsible for web application security; e-commerce and financial applications where script injection would be particularly severe.

Threat Model

Adversary 1 — CDN domain acquisition. An attacker acquires a domain serving JavaScript that thousands of websites include. They modify the served script to inject credential-harvesting code. All sites using the CDN now deliver the malicious script to their users. SRI prevents execution if hashes are specified.

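Adversary 2 — CDN compromise. An attacker gains administrative access to a CDN provider’s infrastructure (through stolen credentials, an exploited vulnerability, or a malicious insider) and modifies the files stored on the CDN. The outcome is the same as for domain acquisition, and SRI blocks execution in the same way because the served file no longer matches the pinned hash.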

Adversary 3 — Subdomain takeover enabling CDN script replacement. A website loads scripts from cdn.example.com. The cdn.example.com subdomain points to a cloud storage bucket or CDN origin that is no longer controlled by the organisation (dangling CNAME). An attacker creates a resource with the same name and serves malicious JavaScript. SRI prevents execution because the attacker's file does not match the pinned hash. A host-based CSP allowlist alone does not help here, since cdn.example.com is still a listed source.

Configuration / Implementation

Step 1 — Inventory all external script dependencies

#!/bin/bash
# scripts/audit-external-scripts.sh
# Find all external JavaScript includes in HTML templates and built output

SEARCH_DIR="${1:-./dist}"

echo "=== External Script Sources ==="
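# Match src="..." or src='...'. No ".js" suffix is required, so URLs such as
# https://js.stripe.com/v3/ are included. grep -P needs GNU grep.
grep -rh '<script' "$SEARCH_DIR" 2>/dev/null | \
    grep -oP "src=[\"']https?://[^\"']+" | \
    sed -E "s/^src=[\"']//" | \
    sort -u | \
    while read -r url; do
        domain=$(echo "$url" | grep -oP '(?<=://)[^/]+')
        echo "  $domain: $url"
    done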

echo ""
echo "=== External Script Sources Without SRI ==="
# Find script tags with external src but no integrity attribute
python3 - "$SEARCH_DIR" << 'PYEOF'
import re
import sys
from pathlib import Path

SCRIPT_PATTERN = re.compile(
    r'<script[^>]*src=["\']https?://[^"\']+["\'][^>]*>',
    re.IGNORECASE
)
SRI_PATTERN = re.compile(r'integrity=["\'][^"\']+["\']', re.IGNORECASE)

# Scan the same directory as the shell section above (passed as the first argument)
for filepath in sorted(Path(sys.argv[1]).rglob("*.html")):
    with open(filepath) as f:
        content = f.read()
    
    for match in SCRIPT_PATTERN.finditer(content):
        tag = match.group(0)
        if not SRI_PATTERN.search(tag):
            print(f"  NO SRI: {filepath}: {tag[:100]}")
PYEOF
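
Regular expressions can miss tags with unusual attribute ordering or unquoted values. As an alternative sketch, the standard-library HTML parser can do the same inventory; the class and function names here are hypothetical, and protocol-relative URLs (`//host/...`) are treated as external by assumption.

```python
from html.parser import HTMLParser


class ExternalScriptFinder(HTMLParser):
    """Collect (src, has_integrity) for every externally hosted <script>."""

    def __init__(self) -> None:
        super().__init__()
        self.scripts: list[tuple[str, bool]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)  # the parser lowercases attribute names
        src = attr.get("src") or ""
        if src.startswith(("http://", "https://", "//")):
            self.scripts.append((src, bool(attr.get("integrity"))))


def find_external_scripts(html: str) -> list[tuple[str, bool]]:
    finder = ExternalScriptFinder()
    finder.feed(html)
    return finder.scripts


if __name__ == "__main__":
    sample = "<script src='https://cdn.example.com/a.js'></script>"
    for src, has_sri in find_external_scripts(sample):
        print(("OK     " if has_sri else "NO SRI ") + src)
```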

Step 2 — Add Subresource Integrity to external script tags

<!-- BEFORE: Vulnerable to CDN supply chain attack -->
<script src="https://cdn.polyfill.io/v3/polyfill.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"></script>

<!-- AFTER: Protected with Subresource Integrity -->
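<!-- Generate hash: openssl dgst -sha384 -binary lodash.min.js | openssl base64 -A -->
<!-- Or use: https://www.srihash.org/ -->
<!-- The integrity value below is a placeholder, not the real hash of this file. -->
<!-- Generate the value from the exact file you pin; a valid sha384 value is 64 base64 characters. -->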
<script
  src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"
  integrity="sha384-UE8+wMKx0UdM3YN6GJWDo5RUi5JWZQObZAJUAM9m1cGJKPeZXjpTH7Z6Dz+lzn5"
  crossorigin="anonymous">
</script>

<!-- For scripts where you cannot guarantee a stable hash (dynamic CDN) -->
<!-- The solution is self-hosting, not SRI -->

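#!/bin/bash
# scripts/generate-sri-hash.sh
# Generate an SRI hash for a JavaScript file

set -euo pipefail

URL="${1:?Usage: $0 <url>}"
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE"' EXIT   # clean up even if the download fails

# -f: fail on HTTP errors so an error page is never hashed; -L: follow redirects
curl -fsSL "$URL" -o "$TEMP_FILE"
HASH=$(openssl dgst -sha384 -binary "$TEMP_FILE" | openssl base64 -A)
echo "integrity=\"sha384-${HASH}\""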
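
For build pipelines that already run Python, the same value can be produced without shelling out. This is a minimal sketch with hypothetical helper names (`sri_value`, `script_tag`); it assumes the file bytes have already been fetched and reviewed. A SHA-384 digest is 48 bytes, so its base64 form is always 64 characters, which makes a truncated copy-paste easy to spot.

```python
import base64
import hashlib
from html import escape


def sri_value(data: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI value such as 'sha384-<base64 digest>' for the given bytes."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(url: str, data: bytes) -> str:
    """Render a script tag pinned to the hash of the supplied file content."""
    return (
        f'<script src="{escape(url, quote=True)}" '
        f'integrity="{sri_value(data)}" crossorigin="anonymous"></script>'
    )


if __name__ == "__main__":
    # Illustrative content only; in practice read the downloaded file's bytes.
    print(script_tag("https://cdn.example.com/lib.min.js", b"var lib = {};"))
```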

Step 3 — Implement a strict Content Security Policy

# Content-Security-Policy that restricts script execution

# Most restrictive: no external scripts at all (self-hosted only)
Content-Security-Policy: script-src 'self'; object-src 'none'; base-uri 'self'

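# Allow specific trusted external scripts with hash constraints.
# Hash sources always match inline scripts; browsers implementing CSP Level 3 also
# match an external script whose integrity attribute carries the same hash.
# The hash shown is a placeholder; substitute the generated value.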
Content-Security-Policy: 
  default-src 'self';
  script-src 
    'self'
    'sha384-UE8+wMKx0UdM3YN6GJWDo5RUi5JWZQObZAJUAM9m1cGJKPeZXjpTH7Z6Dz+lzn5'
    https://js.stripe.com;
  script-src-attr 'none';
  object-src 'none';
  base-uri 'self';
  require-trusted-types-for 'script';
  report-uri https://your-csp-report-endpoint.example.com/csp-violations
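
Policies of this size are easier to review when generated from structured data than when edited as one long string. The sketch below is one possible approach, with a hypothetical `build_csp` helper; it joins directives into a single-line header value and rejects values that would silently alter the policy.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join CSP directives into one header value, e.g. "default-src 'self'; ..."."""
    parts = []
    for name, sources in directives.items():
        for value in [name, *sources]:
            # A stray separator or space would change how the browser parses the policy.
            if not value or any(ch.isspace() or ch in ";," for ch in value):
                raise ValueError(f"invalid CSP token: {value!r}")
        parts.append(" ".join([name, *sources]))
    return "; ".join(parts)


if __name__ == "__main__":
    policy = build_csp({
        "default-src": ["'self'"],
        "script-src": ["'self'", "https://js.stripe.com"],
        "object-src": ["'none'"],
        "base-uri": ["'self'"],
    })
    print("Content-Security-Policy:", policy)
```

Directives that take no value, such as upgrade-insecure-requests, can be passed with an empty list.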

# /etc/nginx/conf.d/csp-headers.conf
# Apply CSP headers for all pages

add_header Content-Security-Policy "
  default-src 'self';
  script-src 'self'
    'sha384-HASH_OF_EACH_INLINE_SCRIPT'
    https://js.stripe.com
    https://www.googletagmanager.com;
  style-src 'self' 'unsafe-inline' https://fonts.googleapis.com;
  font-src 'self' https://fonts.gstatic.com;
  img-src 'self' data: https:;
  connect-src 'self' https://api.example.com;
  frame-src https://js.stripe.com;
  object-src 'none';
  base-uri 'self';
  form-action 'self';
  upgrade-insecure-requests;
  report-uri /api/csp-report
" always;

Step 4 — Self-host critical external scripts

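#!/bin/bash
# scripts/vendor-external-scripts.sh
# Download and vendor external JavaScript files to eliminate CDN dependency
# Requires bash 4+ (associative arrays)

set -euo pipefail   # stop on the first failed download instead of continuing

VENDOR_DIR="public/vendor"
mkdir -p "$VENDOR_DIR"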

# Define external scripts to vendor
declare -A SCRIPTS=(
    ["lodash.min.js"]="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"
    ["react.production.min.js"]="https://cdn.jsdelivr.net/npm/react@18.3.1/umd/react.production.min.js"
)

for filename in "${!SCRIPTS[@]}"; do
    url="${SCRIPTS[$filename]}"
    output="$VENDOR_DIR/$filename"
    
    echo "Downloading: $url"
    curl -fsSL "$url" -o "$output"
    
    # Generate and store the SRI hash for the downloaded file
    HASH=$(openssl dgst -sha384 -binary "$output" | openssl base64 -A)
    echo "sha384-$HASH" > "$output.sri"
    echo "  SRI hash stored: $output.sri"
    
    # Record the source URL and download date for future audits
    cat > "$output.meta" << METAEOF
source_url: $url
downloaded: $(date -u +%Y-%m-%dT%H:%M:%SZ)
sri_hash: sha384-$HASH
METAEOF
done

echo ""
echo "Vendor directory contents:"
ls -la "$VENDOR_DIR"
echo ""
echo "Update HTML templates to reference /vendor/<filename> instead of external CDN URLs"
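
The .sri sidecar files written by the script above are only useful if something checks them. A minimal sketch of such a check, suitable for a CI step, follows; the function name `verify_vendor_dir` is hypothetical, and it assumes each sidecar is named `<asset>.sri` and contains a single `sha384-...` value, as the vendoring script produces.

```python
import base64
import hashlib
import sys
from pathlib import Path


def verify_vendor_dir(vendor_dir: Path) -> list[str]:
    """Return names of vendored files that are missing or differ from their .sri sidecar."""
    mismatches = []
    for sri_file in sorted(vendor_dir.glob("*.sri")):
        asset = sri_file.with_suffix("")  # "lodash.min.js.sri" -> "lodash.min.js"
        expected = sri_file.read_text().strip()
        if not asset.is_file():
            mismatches.append(asset.name)
            continue
        digest = hashlib.sha384(asset.read_bytes()).digest()
        actual = "sha384-" + base64.b64encode(digest).decode("ascii")
        if actual != expected:
            mismatches.append(asset.name)
    return mismatches


if __name__ == "__main__":
    bad = verify_vendor_dir(Path("public/vendor"))
    for name in bad:
        print(f"HASH MISMATCH: {name}")
    sys.exit(1 if bad else 0)
```

Committing the sidecars alongside the vendored files means any later modification shows up as a failed check rather than a silent change.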

Step 5 — Automate external script monitoring for changes

#!/usr/bin/env python3
# scripts/external-script-monitor.py
# Monitor external scripts for unexpected changes
# Alert when a script's hash changes between checks

import base64
import hashlib
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("/var/lib/script-monitor/hashes.json")
ALERT_WEBHOOK = ""  # Set to Slack/Teams webhook (not used yet; alerts are printed)

# Scripts to monitor: add all external scripts your sites use
MONITORED_SCRIPTS = [
    {
        "name": "Stripe.js",
        "url": "https://js.stripe.com/v3/",
        "expected_domains": ["js.stripe.com"],
        "high_value": True  # Payment processing: alert immediately on change
    },
    {
        "name": "Google Tag Manager",
        "url": "https://www.googletagmanager.com/gtm.js?id=GTM-XXXX",
        "expected_domains": ["www.googletagmanager.com"],
        "high_value": False
    },
]

def fetch_and_hash(url: str) -> tuple[str, int]:
    """Fetch a URL and return its SRI-style SHA-384 hash (base64) and size."""
    with urllib.request.urlopen(url, timeout=15) as resp:
        content = resp.read()
    # SRI values are base64-encoded digests, not hex, so encode accordingly
    digest = hashlib.sha384(content).digest()
    return (
        "sha384-" + base64.b64encode(digest).decode("ascii"),
        len(content)
    )

def load_state() -> dict:
    try:
        return json.loads(STATE_FILE.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

def save_state(state: dict):
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(state, indent=2))

def check_scripts():
    state = load_state()
    alerts = []
    
    for script in MONITORED_SCRIPTS:
        name = script["name"]
        url = script["url"]
        
        try:
            current_hash, size = fetch_and_hash(url)
            previous = state.get(url, {})
            
            if previous and previous.get("hash") != current_hash:
                alerts.append({
                    "name": name,
                    "url": url,
                    "previous_hash": previous.get("hash"),
                    "current_hash": current_hash,
                    "previous_size": previous.get("size"),
                    "current_size": size,
                    "high_value": script.get("high_value", False)
                })
            
            state[url] = {
                "hash": current_hash,
                "size": size,
                "last_checked": datetime.now(timezone.utc).isoformat()
            }
        except Exception as e:
            print(f"Warning: Failed to check {name}: {e}")
    
    save_state(state)
    
    for alert in alerts:
        severity = "CRITICAL" if alert["high_value"] else "WARNING"
        print(f"{severity}: {alert['name']} hash changed!")
        print(f"  URL: {alert['url']}")
        print(f"  Previous: {alert['previous_hash']}")
        print(f"  Current:  {alert['current_hash']}")

if __name__ == "__main__":
    check_scripts()
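
The expected_domains field in MONITORED_SCRIPTS is declared but never checked by the monitor above. One way it could be used, sketched here with a hypothetical helper, is to confirm that the response was still served from an expected host after any redirects. With urllib, the final URL is available as `resp.geturl()` inside `fetch_and_hash`.

```python
from urllib.parse import urlparse


def served_from_expected_domain(final_url: str, expected_domains: list[str]) -> bool:
    """True if the final URL is HTTPS and its host exactly matches an expected domain."""
    parsed = urlparse(final_url)
    host = (parsed.hostname or "").lower()
    allowed = {domain.lower() for domain in expected_domains}
    # Exact match only: "js.stripe.com.evil.example" must not pass.
    return parsed.scheme == "https" and host in allowed


if __name__ == "__main__":
    print(served_from_expected_domain("https://js.stripe.com/v3/", ["js.stripe.com"]))
```

A redirect to an unexpected host is a stronger signal than a hash change alone, so it is reasonable to treat it as critical regardless of the high_value flag.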

Expected Behaviour

Scenario	Without controls	With controls in place
CDN domain acquired; malicious script served	Browser executes malicious JavaScript; users attacked	SRI hash mismatch; browser refuses to execute script
CDN serves dynamic content; hash cannot be precomputed	SRI cannot be applied; script runs unverified	Script is self-hosted; no CDN dependency
Injected code loads a script from an unlisted source	No restriction; any script executes	CSP blocks the script; violation reported
CDN script changes legitimately (library update)	No detection; new version runs	SRI hash mismatch; page breaks (intentional: requires a deliberate hash update)
External script content changes between checks	No monitoring; change goes unnoticed	Alert fires within the monitoring interval
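
Violation reporting only helps if something reads the reports. The following sketch summarises a report body, assuming the legacy report-uri JSON shape in which fields sit under a top-level "csp-report" key; the newer Reporting API wraps reports differently, and the function names are hypothetical.

```python
import json


def summarize_csp_report(body: bytes) -> dict[str, str]:
    """Extract the page, blocked URI, and directive from a report-uri style payload."""
    payload = json.loads(body)
    report = payload.get("csp-report", {}) if isinstance(payload, dict) else {}
    directive = report.get("effective-directive") or report.get("violated-directive", "")
    return {
        "page": report.get("document-uri", ""),
        "blocked": report.get("blocked-uri", ""),
        "directive": directive,
    }


def is_script_violation(summary: dict[str, str]) -> bool:
    """Script-related directives are the ones relevant to injected third-party code."""
    return summary["directive"].startswith("script-src")


if __name__ == "__main__":
    sample = json.dumps({"csp-report": {
        "document-uri": "https://shop.example.com/checkout",
        "violated-directive": "script-src 'self'",
        "blocked-uri": "https://unlisted.example.net/x.js",
    }}).encode()
    print(summarize_csp_report(sample))
```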

Trade-offs

Aspect	Benefit	Cost	Mitigation
SRI on external scripts	Prevents modified script execution	Page breaks if script changes (even legitimately)	Treat SRI hash update as a required step in library upgrade workflow
Strict CSP	Prevents execution of unlisted scripts	Breaks inline scripts and some third-party integrations	Migrate inline scripts to files; audit and list all legitimate sources
Self-hosting external scripts	Eliminates CDN supply chain risk entirely	Must manage updates; may miss security patches	Automate update checks; include in vulnerability scanning pipeline
Script change monitoring	Early detection of CDN compromise	False alerts on legitimate updates	Monitor critical scripts (payment, auth) only; set longer check interval for low-risk scripts

Failure Modes

Failure	Symptom	Detection	Recovery
SRI hash outdated after library update	Page fails to load scripts; JavaScript errors in console	Browser console errors; front-end error monitoring or synthetic page checks	Regenerate SRI hash for new version; update HTML template
CSP blocks legitimate analytics	Analytics stops collecting; business team reports data gap	Missing data in analytics dashboard	Add analytics domain to CSP script-src; test before deploying
Self-hosted script not updated after security release	Running outdated version with known CVEs	Dependency scanner finds vulnerable version	Add vendored script to vulnerability scanning pipeline; alert on new version releases
Script monitor produces false alerts on CDN cache purge	Alert fires on minor CDN infrastructure changes	Alert fires but the content diff is trivial or reverts on the next check	Require the new hash to persist over multiple checks before alerting; check from multiple endpoints
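
Requiring a changed hash to persist across several checks can be sketched as a small state update. The helper name `confirm_change` and the `pending_hash` / `pending_count` keys are hypothetical additions to the per-URL state entry, not part of the monitor as written.

```python
def confirm_change(entry: dict, observed_hash: str, required: int = 3) -> bool:
    """Update entry in place; return True once a new hash is seen `required` times in a row."""
    if observed_hash == entry.get("hash"):
        # Back to the known hash: discard any pending candidate.
        entry.pop("pending_hash", None)
        entry.pop("pending_count", None)
        return False

    if observed_hash == entry.get("pending_hash"):
        entry["pending_count"] = entry.get("pending_count", 0) + 1
    else:
        entry["pending_hash"] = observed_hash
        entry["pending_count"] = 1

    if entry["pending_count"] >= required:
        # The change has persisted: accept it as the new baseline and alert.
        entry["hash"] = observed_hash
        entry.pop("pending_hash")
        entry.pop("pending_count")
        return True
    return False
```

Delaying an alert also delays detection of a real compromise, so for payment or authentication scripts it may be preferable to pass required=1 and accept the extra noise.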

Related Articles

Software Supply Chain Third-Party Risk — broader supply chain risk management beyond CDN scripts
Dependency Confusion Defence — the npm/PyPI equivalent of CDN supply chain attacks
HTTP Security Headers — implementing CSP and other security headers on web applications
SBOM Generation and Consumption — tracking all dependencies including vendored scripts
AI Coding Supply Chain Risk — supply chain risk in AI-generated code that may introduce external dependencies