CDN and Third-Party Script Supply Chain Security: Lessons from polyfill.io

Problem

In June 2024, polyfill.io — a CDN serving JavaScript polyfills to over 100,000 websites — was found to be serving malicious JavaScript. The attack vector was domain acquisition: a Chinese technology firm had purchased the polyfill.io domain and its associated GitHub repository earlier in the year. After acquisition, the CDN infrastructure was modified to inject malicious code into the polyfill bundles being served to end users. The injected code targeted specific user agents and performed redirects to scam and malware sites.

The polyfill.io incident is a clear example of a CDN supply chain attack:

  1. A widely trusted CDN service is used by many websites
  2. Ownership of the domain/service changes hands
  3. The new operator modifies the served content to include malicious code
  4. Every website including the CDN’s scripts now delivers malicious code to users
  5. Detection is delayed because the CDN is trusted and the scripts appear to load from a legitimate, well-known domain

Why <script src="https://cdn.polyfill.io/..."> is dangerous. When a page includes a script from an external CDN, the browser executes that script with the full permissions of the page origin. The script can:

  • Read and exfiltrate any cookies not marked HttpOnly (including session tokens stored that way)
  • Capture form input (credentials, credit card data)
  • Make authenticated API calls on behalf of the user
  • Inject additional scripts from other sources
  • Modify page content to perform phishing

The scope is larger than polyfill.io. The same attack applies to any externally hosted script: analytics libraries (Google Analytics, Matomo), A/B testing tools (Optimizely, LaunchDarkly), customer support widgets (Intercom, Zendesk), payment SDKs (Stripe.js, PayPal), social login buttons (Google, Facebook), CDN-hosted JavaScript frameworks.

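Subresource Integrity is the technical control. SRI allows a page to specify the expected cryptographic hash of a linked resource. If the served resource does not match the hash, the browser refuses to execute it. For a static, versioned file this blocks the attack outright: any injected code changes the hash. polyfill.io itself generated bundles per user agent, so its responses had no single stable hash to pin; for services like that, the control is self-hosting (Step 4) rather than SRI.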
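
The check SRI performs can be sketched in a few lines of Python. This is a minimal illustration, not the browser's implementation: the function names sri_hash and matches_integrity are hypothetical, and the matching rule is simplified (a browser only considers the strongest algorithm listed, while this sketch accepts a match on any listed hash).

```python
import base64
import hashlib

# Algorithms the SRI specification allows in an integrity attribute
SRI_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity string such as 'sha384-<base64 digest>'."""
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Approximate the browser check: does content match a pinned hash?

    Simplification: a match on any listed hash is accepted.
    """
    for token in integrity.split():
        algorithm, _, _ = token.partition("-")
        if algorithm in SRI_ALGORITHMS and sri_hash(content, algorithm) == token:
            return True
    return False
```

Pinning the hash of the legitimate file and then calling matches_integrity on a modified copy returns False, which is the condition under which the browser would refuse to run the script.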

Target systems: any web application that loads JavaScript from external CDNs; organisations responsible for web application security; e-commerce and financial applications where script injection would be particularly severe.


Threat Model

Adversary 1 — CDN domain acquisition. An attacker acquires a domain serving JavaScript that thousands of websites include. They modify the served script to inject credential-harvesting code. All sites using the CDN now deliver the malicious script to their users. SRI prevents execution if hashes are specified.

Adversary 2 — CDN compromise. An attacker gains administrative access to a CDN provider’s infrastructure (via credential theft, CVE exploitation, or insider threat). They modify the files stored at the CDN. Same outcome as domain acquisition.

Adversary 3: Subdomain takeover enabling CDN script replacement. A website loads scripts from cdn.example.com. The cdn.example.com subdomain points to a cloud storage bucket or CDN origin that is no longer controlled by the organisation (dangling CNAME). An attacker creates a resource with the same name and serves malicious JavaScript. SRI prevents execution because the attacker's file does not match the pinned hash. A CSP host allowlist alone does not, since cdn.example.com is still a listed source.
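
One inexpensive signal for Adversary 3 is a script host that no longer resolves at all. The sketch below is an assumption-laden starting point rather than a complete takeover detector: the names resolves and unresolved_hosts are hypothetical, and a CNAME that points at a deleted bucket may still resolve, so provider-specific checks are needed on top of this.

```python
import socket
from typing import Callable, Iterable


def resolves(host: str) -> bool:
    """Return True if the host currently resolves to at least one address."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False


def unresolved_hosts(
    hosts: Iterable[str], resolver: Callable[[str], bool] = resolves
) -> list[str]:
    """Return the script hosts that no longer resolve, in sorted order."""
    return sorted({host for host in hosts if not resolver(host)})
```

The resolver is passed in as a parameter so the logic can be exercised without network access; in production the default DNS lookup is used.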


Configuration / Implementation

Step 1 — Inventory all external script dependencies

#!/bin/bash
# scripts/audit-external-scripts.sh
# Find all external JavaScript includes in HTML templates and built output

SEARCH_DIR="${1:-./dist}"

echo "=== External Script Sources ==="
# Match every external src, not only URLs ending in .js (e.g. https://js.stripe.com/v3/)
grep -r '<script' "$SEARCH_DIR" 2>/dev/null | \
    grep -oP 'src="https?://[^"]+"' | \
    sort -u | \
    while read -r src; do
        domain=$(echo "$src" | grep -oP '(?<=://)[^/]+')
        echo "  $domain: $src"
    done

echo ""
echo "=== External Script Sources Without SRI ==="
# Find script tags with external src but no integrity attribute
# Pass the search directory through so both checks scan the same tree
python3 - "$SEARCH_DIR" << 'PYEOF'
import re
import sys
import glob

SCRIPT_PATTERN = re.compile(
    r'<script[^>]*src=["\']https?://[^"\']+["\'][^>]*>',
    re.IGNORECASE
)
SRI_PATTERN = re.compile(r'integrity=["\'][^"\']+["\']', re.IGNORECASE)

search_dir = sys.argv[1]
for filepath in glob.glob(f"{search_dir}/**/*.html", recursive=True):
    with open(filepath) as f:
        content = f.read()
    
    for match in SCRIPT_PATTERN.finditer(content):
        tag = match.group(0)
        if not SRI_PATTERN.search(tag):
            print(f"  NO SRI: {filepath}: {tag[:100]}")
PYEOF
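
Regular expressions over HTML miss some cases, such as protocol-relative URLs or unusual attribute quoting. As a hedged alternative, the same inventory can be sketched with Python's standard html.parser module. The class name ExternalScriptAudit and the function scripts_missing_sri are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class ExternalScriptAudit(HTMLParser):
    """Collect external <script src> URLs and whether each has integrity."""

    def __init__(self) -> None:
        super().__init__()
        self.scripts: list[tuple[str, bool]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        # Protocol-relative URLs (//cdn.example.com/x.js) are external too
        if src.startswith(("http://", "https://", "//")):
            self.scripts.append((src, bool(attributes.get("integrity"))))


def scripts_missing_sri(html: str) -> list[str]:
    """Return external script URLs that have no integrity attribute."""
    audit = ExternalScriptAudit()
    audit.feed(html)
    return [src for src, has_sri in audit.scripts if not has_sri]
```

Feeding each built HTML file through scripts_missing_sri and failing the build when the list is non-empty turns the audit into an enforced check.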

Step 2 — Add Subresource Integrity to external script tags

<!-- BEFORE: Vulnerable to CDN supply chain attack -->
<script src="https://cdn.polyfill.io/v3/polyfill.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"></script>

<!-- AFTER: Protected with Subresource Integrity -->
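<!-- Generate hash: openssl dgst -sha384 -binary lodash.min.js | openssl base64 -A -->
<!-- Or use: https://www.srihash.org/ -->
<!-- The integrity value below is illustrative; generate the real one from the exact file you ship -->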
<script
  src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"
  integrity="sha384-UE8+wMKx0UdM3YN6GJWDo5RUi5JWZQObZAJUAM9m1cGJKPeZXjpTH7Z6Dz+lzn5"
  crossorigin="anonymous">
</script>

<!-- For scripts where you cannot guarantee a stable hash (dynamic CDN) -->
<!-- The solution is self-hosting, not SRI -->

#!/bin/bash
# scripts/generate-sri-hash.sh
# Generate an SRI hash for a JavaScript file

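set -euo pipefail

URL="${1:?Usage: $0 <url>}"
TEMP_FILE=$(mktemp)
# Remove the temp file even if the download fails
trap 'rm -f "$TEMP_FILE"' EXIT

# -f fails on HTTP errors so an error page is never hashed; -L follows redirects
curl -fsSL "$URL" -o "$TEMP_FILE"
HASH=$(openssl dgst -sha384 -binary "$TEMP_FILE" | openssl base64 -A)
echo "integrity=\"sha384-${HASH}\""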

Step 3 — Implement a strict Content Security Policy

# Content-Security-Policy that restricts script execution

# Most restrictive: no external scripts at all (self-hosted only)
Content-Security-Policy: script-src 'self'; object-src 'none'; base-uri 'self'

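# Allow specific trusted external scripts with hash constraints
# (a hash in script-src matches an external script only when the tag carries a
# matching integrity attribute; this is a CSP Level 3 feature, so check browser support)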
Content-Security-Policy: 
  default-src 'self';
  script-src 
    'self'
    'sha384-UE8+wMKx0UdM3YN6GJWDo5RUi5JWZQObZAJUAM9m1cGJKPeZXjpTH7Z6Dz+lzn5'
    https://js.stripe.com;
  script-src-attr 'none';
  object-src 'none';
  base-uri 'self';
  require-trusted-types-for 'script';
  report-uri https://your-csp-report-endpoint.example.com/csp-violations

# /etc/nginx/conf.d/csp-headers.conf
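# Apply CSP headers for all pages
# The value is split across lines for readability; nginx keeps the line breaks
# in the header value, so join it into one line if a client or proxy rejects it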

add_header Content-Security-Policy "
  default-src 'self';
  script-src 'self'
    'sha384-HASH_OF_EACH_INLINE_SCRIPT'
    https://js.stripe.com
    https://www.googletagmanager.com;
  style-src 'self' 'unsafe-inline' https://fonts.googleapis.com;
  font-src 'self' https://fonts.gstatic.com;
  img-src 'self' data: https:;
  connect-src 'self' https://api.example.com;
  frame-src https://js.stripe.com;
  object-src 'none';
  base-uri 'self';
  form-action 'self';
  upgrade-insecure-requests;
  report-uri /api/csp-report
" always;
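
Hand-edited policies tend to accumulate stray semicolons and line breaks. One way to avoid that is to generate the header value from structured data. The sketch below assumes a hypothetical helper named build_csp; the directive values are taken from the examples above and are placeholders for your own sources.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join directives into a single-line Content-Security-Policy value."""
    parts = []
    for name, values in directives.items():
        for value in values:
            # A stray ';', ',' or line break would split or corrupt the policy
            if any(ch in value for ch in ";,\r\n"):
                raise ValueError(f"invalid character in CSP source: {value!r}")
        parts.append(" ".join([name, *values]))
    return "; ".join(parts)


POLICY = build_csp({
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://js.stripe.com"],
    "object-src": ["'none'"],
    "base-uri": ["'self'"],
    "upgrade-insecure-requests": [],
    "report-uri": ["/api/csp-report"],
})
```

The resulting string contains no line breaks, so it can be written into a server configuration or set by the application as a single header value.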

Step 4 — Self-host critical external scripts

#!/bin/bash
# scripts/vendor-external-scripts.sh
# Download and vendor external JavaScript files to eliminate CDN dependency

VENDOR_DIR="public/vendor"
mkdir -p "$VENDOR_DIR"

# Define external scripts to vendor (associative arrays require bash 4 or later)
declare -A SCRIPTS=(
    ["lodash.min.js"]="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"
    ["react.production.min.js"]="https://cdn.jsdelivr.net/npm/react@18.3.1/umd/react.production.min.js"
)

for filename in "${!SCRIPTS[@]}"; do
    url="${SCRIPTS[$filename]}"
    output="$VENDOR_DIR/$filename"
    
    echo "Downloading: $url"
    curl -fsSL "$url" -o "$output"
    
    # Generate and store the SRI hash for the downloaded file
    HASH=$(openssl dgst -sha384 -binary "$output" | openssl base64 -A)
    echo "sha384-$HASH" > "$output.sri"
    echo "  SRI hash stored: $output.sri"
    
    # Record the source URL and download date for future audits
    cat > "$output.meta" << METAEOF
source_url: $url
downloaded: $(date -u +%Y-%m-%dT%H:%M:%SZ)
sri_hash: sha384-$HASH
METAEOF
done

echo ""
echo "Vendor directory contents:"
ls -la "$VENDOR_DIR"
echo ""
echo "Update HTML templates to reference /vendor/<filename> instead of external CDN URLs"
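
The .sri files written by the vendoring script are only useful if something checks them later. A minimal sketch of such a check, suitable for a CI step, is shown below. It assumes the file layout produced above (name.js next to name.js.sri); the function name verify_vendor_dir is hypothetical.

```python
import base64
import hashlib
from pathlib import Path


def verify_vendor_dir(vendor_dir: Path) -> list[str]:
    """Return names of vendored files that no longer match their .sri file."""
    mismatched = []
    for sri_file in sorted(vendor_dir.glob("*.sri")):
        target = sri_file.with_suffix("")  # lodash.min.js.sri -> lodash.min.js
        expected = sri_file.read_text().strip()
        if not target.is_file():
            mismatched.append(target.name)  # recorded hash but file is gone
            continue
        digest = hashlib.sha384(target.read_bytes()).digest()
        actual = "sha384-" + base64.b64encode(digest).decode("ascii")
        if actual != expected:
            mismatched.append(target.name)
    return mismatched
```

An empty list means every vendored file still matches the hash recorded at download time; any other result suggests the file was modified after vendoring and should be investigated before deploy.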

Step 5 — Automate external script monitoring for changes

#!/usr/bin/env python3
# scripts/external-script-monitor.py
# Monitor external scripts for unexpected changes
# Alert when a script's hash changes between checks

import base64
import hashlib
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("/var/lib/script-monitor/hashes.json")
ALERT_WEBHOOK = ""  # Set to Slack/Teams webhook (not used below; alerts are printed)

# Scripts to monitor: add all external scripts your sites use
MONITORED_SCRIPTS = [
    {
        "name": "Stripe.js",
        "url": "https://js.stripe.com/v3/",
        "expected_domains": ["js.stripe.com"],
        "high_value": True  # Payment processing: alert immediately on change
    },
    {
        "name": "Google Tag Manager",
        "url": "https://www.googletagmanager.com/gtm.js?id=GTM-XXXX",
        "expected_domains": ["www.googletagmanager.com"],
        "high_value": False
    },
]

def fetch_and_hash(url: str) -> tuple[str, int]:
    """Fetch a URL and return its SRI-style SHA-384 hash and size."""
    with urllib.request.urlopen(url, timeout=15) as resp:
        content = resp.read()
    # SRI strings are base64 of the raw digest, not hex
    digest = hashlib.sha384(content).digest()
    return (
        "sha384-" + base64.b64encode(digest).decode("ascii"),
        len(content)
    )

def load_state() -> dict:
    try:
        return json.loads(STATE_FILE.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

def save_state(state: dict):
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(state, indent=2))

def check_scripts():
    state = load_state()
    alerts = []
    
    for script in MONITORED_SCRIPTS:
        name = script["name"]
        url = script["url"]
        
        try:
            current_hash, size = fetch_and_hash(url)
            previous = state.get(url, {})
            
            if previous and previous.get("hash") != current_hash:
                alerts.append({
                    "name": name,
                    "url": url,
                    "previous_hash": previous.get("hash"),
                    "current_hash": current_hash,
                    "previous_size": previous.get("size"),
                    "current_size": size,
                    "high_value": script.get("high_value", False)
                })
            
            state[url] = {
                "hash": current_hash,
                "size": size,
                "last_checked": datetime.now(timezone.utc).isoformat()
            }
        except Exception as e:
            print(f"Warning: Failed to check {name}: {e}")
    
    save_state(state)
    
    for alert in alerts:
        severity = "CRITICAL" if alert["high_value"] else "WARNING"
        print(f"{severity}: {alert['name']} hash changed!")
        print(f"  URL: {alert['url']}")
        print(f"  Previous: {alert['previous_hash']}")
        print(f"  Current:  {alert['current_hash']}")

if __name__ == "__main__":
    check_scripts()
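
The monitor defines ALERT_WEBHOOK but only prints its alerts. One possible way to deliver them is sketched below, assuming a Slack-style incoming webhook that accepts a JSON body with a "text" field; other services expect different payloads. The function names format_alert and send_alert are hypothetical, and the alert dictionary uses the same keys as the alerts list built in check_scripts.

```python
import json
import urllib.request


def format_alert(alert: dict) -> dict:
    """Build a Slack-style webhook payload ({"text": ...}) for one alert."""
    severity = "CRITICAL" if alert.get("high_value") else "WARNING"
    text = (
        f"{severity}: {alert['name']} hash changed\n"
        f"URL: {alert['url']}\n"
        f"Previous: {alert['previous_hash']}\n"
        f"Current: {alert['current_hash']}"
    )
    return {"text": text}


def send_alert(webhook_url: str, alert: dict) -> None:
    """POST the alert as JSON; does nothing if no webhook is configured."""
    if not webhook_url:
        return
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(format_alert(alert)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10):
        pass  # the response body is not needed
```

Calling send_alert(ALERT_WEBHOOK, alert) inside the final loop of check_scripts would forward each alert while leaving the printed output in place.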

Expected Behaviour

| Scenario | Without SRI/CSP | With SRI + CSP |
|---|---|---|
| CDN domain acquired; malicious script served | Browser executes malicious JavaScript; users attacked | SRI hash mismatch; browser refuses to execute script |
| CDN serves dynamic content; hash cannot be precomputed | SRI is not possible for dynamic scripts | Script is self-hosted; no CDN dependency |
| CSP blocks unexpected script source | No restriction; any script executes | CSP blocks script from unlisted source; violation reported |
| CDN script changes legitimately (library update) | No detection; new version runs | SRI hash mismatch; page breaks (intentional: requires a deliberate hash update) |
| Script change monitoring detects hash change | No monitoring; silent change | Alert fires within monitoring interval |
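
The "violation reported" outcome depends on an endpoint that receives and reads the reports. Browsers that honour report-uri POST a JSON document with a top-level "csp-report" object. The sketch below shows one way to pull out the fields most useful for triage; the function name summarize_csp_report and the output keys are hypothetical.

```python
import json


def summarize_csp_report(body: bytes) -> dict:
    """Extract the fields most useful for triage from a report-uri payload."""
    payload = json.loads(body)
    report = payload.get("csp-report", {})
    return {
        "page": report.get("document-uri", ""),
        "blocked": report.get("blocked-uri", ""),
        # Fall back to violated-directive when effective-directive is absent
        "directive": report.get("effective-directive")
        or report.get("violated-directive", ""),
    }
```

Grouping the summaries by the "blocked" value is a quick way to separate a missing allowlist entry from an unexpected script source.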

Trade-offs

| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| SRI on external scripts | Prevents modified script execution | Page breaks if script changes (even legitimately) | Treat SRI hash update as a required step in library upgrade workflow |
| Strict CSP | Prevents execution of unlisted scripts | Breaks inline scripts and some third-party integrations | Migrate inline scripts to files; audit and list all legitimate sources |
| Self-hosting external scripts | Eliminates CDN supply chain risk entirely | Must manage updates; may miss security patches | Automate update checks; include in vulnerability scanning pipeline |
| Script change monitoring | Early detection of CDN compromise | False alerts on legitimate updates | Monitor critical scripts (payment, auth) only; set longer check interval for low-risk scripts |

Failure Modes

| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| SRI hash outdated after library update | Page fails to load scripts; JavaScript errors in console | Browser console errors; CSP violation reports | Regenerate SRI hash for new version; update HTML template |
| CSP blocks legitimate analytics | Analytics stops collecting; business team reports data gap | Missing data in analytics dashboard | Add analytics domain to CSP script-src; test before deploying |
| Self-hosted script not updated after security release | Running outdated version with known CVEs | Dependency scanner finds vulnerable version | Add vendored script to vulnerability scanning pipeline; alert on new version releases |
| Script monitor produces false alerts on CDN cache purge | Alert fires on minor CDN infrastructure changes | Alert fires but the content difference is trivial | Compare hash over multiple checks before alerting; use multiple check endpoints |
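
The recovery "compare hash over multiple checks before alerting" can be sketched as a small state machine. This is one possible design under stated assumptions, not a prescribed one: the class name ChangeConfirmer and the default of three confirmations are hypothetical. Confirmation also delays detection, and a script that is malicious only for some requests may never produce a consistent run, so it may be unsuitable for high-value scripts.

```python
from collections import defaultdict


class ChangeConfirmer:
    """Report a hash change only after N consecutive checks agree on it."""

    def __init__(self, required: int = 3) -> None:
        self.required = required
        self.baseline: dict[str, str] = {}
        self.candidate: dict[str, str] = {}
        self.streak: dict[str, int] = defaultdict(int)

    def observe(self, url: str, current_hash: str) -> bool:
        """Record one check; return True when a change is confirmed."""
        if url not in self.baseline:
            self.baseline[url] = current_hash  # first sighting sets the baseline
            return False
        if current_hash == self.baseline[url]:
            self.streak[url] = 0  # back to normal, discard any pending change
            self.candidate.pop(url, None)
            return False
        if self.candidate.get(url) != current_hash:
            self.candidate[url] = current_hash  # a different value restarts the count
            self.streak[url] = 0
        self.streak[url] += 1
        if self.streak[url] >= self.required:
            self.baseline[url] = current_hash  # confirmed: becomes the new baseline
            self.candidate.pop(url, None)
            self.streak[url] = 0
            return True
        return False
```

A transient difference that reverts on the next check resets the count, so only a change that persists across the required number of checks raises an alert.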