AI-Assisted CVE Patch Prioritisation: EPSS, Reachability, and Business Context

Problem

A medium-sized engineering organisation running 50 services will accumulate hundreds of open CVEs at any given time. Grype or Trivy scanning container images and dependencies produces a backlog that looks manageable in a dashboard — until you try to actually work through it. CVSS scores don’t distinguish between a critical vulnerability in a package that is linked but never called and one in the hottest code path of your most exposed service. A score of 9.8 is frightening on paper; it may be irrelevant if the vulnerable function requires a configuration that no production deployment uses.

Traditional prioritisation approaches have two failure modes:

Severity-only prioritisation leads to patch fatigue and under-patching of genuinely dangerous vulnerabilities. Every CVSS ≥7.0 gets a ticket, engineers are overwhelmed, and the backlog grows until a triage shortcut — “if nothing is actively exploited, skip it” — becomes informal policy. The vulnerabilities that matter get lost in the noise.

Manual triage by security analysts doesn’t scale. Researching the exploitability context of 300 CVEs takes weeks of analyst time per cycle, requires subsystem expertise that no single analyst has, and produces inconsistent quality depending on who does the research.

AI tools address both problems. An LLM with access to the right data sources can, in seconds per CVE, synthesise:

EPSS score — the Exploit Prediction Scoring System probability that the CVE will be exploited in the wild within 30 days
CISA KEV status — whether the vulnerability is already being actively exploited
Reachability analysis — whether the vulnerable code path is actually called in your codebase (static or dynamic analysis result)
Deployment context — whether the affected service is internet-facing, handles PII, or has privileged cloud access
Compensating controls — whether a WAF rule, network restriction, or configuration change already mitigates the vulnerability
Patch complexity — whether the fix is a one-line version bump or a major API refactor

The output is a prioritised, reasoned recommendation: “Patch CVE-X this week because its EPSS is 78%, the vulnerable endpoint is internet-facing, and it lacks WAF coverage. Defer CVE-Y for 90 days because its EPSS is 0.02%, the function is never called in any deployment, and the service is internal-only.”

The risk is over-reliance. An AI that incorrectly classifies a vulnerability as “low reachability” because it misreads the call graph, or that misses a compensating control that was later removed, can cause a genuinely dangerous CVE to be deferred. The AI reduces analyst time but cannot replace analyst judgment on edge cases, and any automated deferral decision needs to be reviewable and reversible.

Target systems: security teams managing vulnerability backlogs of >100 open CVEs; organisations with >10 production services; teams using Grype, Trivy, Snyk, or similar SCA tools; any environment where patch velocity is constrained by competing priorities.

Threat Model

The threat in this context is the risk of incorrect prioritisation:

Risk 1 — False low priority (dangerous deferral). AI incorrectly classifies a critical exploitable vulnerability as low priority. A threat actor exploits it within the deferral window.

Risk 2 — False high priority (alarm fatigue). AI generates too many high-priority recommendations. Engineers cannot work through the backlog, and the genuine high-priority items are delayed by the false positives.

Risk 3 — Stale context (compensating control removed). AI prioritisation correctly accounts for a compensating control (WAF rule). The WAF rule is later removed. The CVE remains in “deferred” status. The compensating control that justified the deferral no longer exists.

Risk 4 — Reachability hallucination. AI reports a function is unreachable based on a misread of the codebase. The function is actually called via reflection or dynamic dispatch that static analysis missed.

All four risks are mitigated by treating AI prioritisation as a recommendation that a human can review and reverse, not as a final automated decision, and by building an expiry mechanism into deferred decisions.

Configuration / Implementation

Step 1 — Build the data pipeline: CVE context aggregation

Before calling the LLM, aggregate all available context for each CVE:

# cve_context.py — aggregate context for a CVE before LLM analysis
import httpx
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CVEContext:
    cve_id: str
    cvss_score: float
    cvss_vector: str
    epss_score: float          # 0.0–1.0 probability of exploitation
    epss_percentile: float
    in_cisa_kev: bool          # Actively exploited per CISA
    affected_package: str
    affected_version: str
    fixed_version: Optional[str]
    affected_services: list[str] = field(default_factory=list)
    reachability: str = "unknown"  # "reachable", "unreachable", "unknown"
    service_internet_facing: bool = False
    service_handles_pii: bool = False
    service_has_privileged_cloud_access: bool = False
    compensating_controls: list[str] = field(default_factory=list)
    nvd_description: str = ""

async def fetch_epss(cve_id: str) -> tuple[float, float]:
    """Fetch EPSS score from FIRST API."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api.first.org/data/v1/epss?cve={cve_id}"
        )
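        response.raise_for_status()  # HTTP errors must not fall through to the 0.0 default
        data = response.json()
        if data.get("data"):
            epss = data["data"][0]
            return float(epss["epss"]), float(epss["percentile"])
    # No EPSS record returned for this CVE: callers should read 0.0 as "missing", not "safe"
    return 0.0, 0.0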

async def check_cisa_kev(cve_id: str) -> bool:
    """Check if CVE is in CISA Known Exploited Vulnerabilities catalog."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
        )
        response.raise_for_status()  # A failed download must not be read as "not in KEV"
        kev = response.json()
        return any(v["cveID"] == cve_id for v in kev.get("vulnerabilities", []))

async def build_cve_context(
    cve_id: str,
    affected_package: str,
    affected_version: str,
    service_inventory: dict
) -> CVEContext:
    """Build complete context for AI prioritisation."""
    epss_score, epss_percentile = await fetch_epss(cve_id)
    in_kev = await check_cisa_kev(cve_id)
    
    # Identify affected services from inventory
    affected_services = [
        service for service, deps in service_inventory.items()
        if any(d["package"] == affected_package and d["version"] == affected_version
               for d in deps)
    ]
    
    return CVEContext(
        cve_id=cve_id,
        cvss_score=0.0,  # Filled from scanner output
        cvss_vector="",
        epss_score=epss_score,
        epss_percentile=epss_percentile,
        in_cisa_kev=in_kev,
        affected_package=affected_package,
        affected_version=affected_version,
        affected_services=affected_services,
    )
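
As written, `check_cisa_kev` downloads the whole KEV catalogue once per CVE, which adds up quickly across a backlog of several hundred findings. One possible variation, sketched below, fetches the catalogue once per batch and tests membership against a set. The helper names `kev_ids_from_catalog` and `load_kev_ids` are hypothetical, and the sketch uses the standard library's `urllib` rather than `httpx` only to stay dependency-free.

```python
import json
import urllib.request

KEV_URL = (
    "https://www.cisa.gov/sites/default/files/feeds/"
    "known_exploited_vulnerabilities.json"
)

def kev_ids_from_catalog(catalog: dict) -> frozenset[str]:
    """Extract the set of CVE IDs from a parsed KEV catalogue."""
    return frozenset(
        entry["cveID"] for entry in catalog.get("vulnerabilities", [])
    )

def load_kev_ids(url: str = KEV_URL, timeout: float = 30.0) -> frozenset[str]:
    """Download the catalogue once; intended to run once per batch, not per CVE."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return kev_ids_from_catalog(json.load(response))

# Offline illustration with a hand-written catalogue fragment (no network call).
sample_catalog = {"vulnerabilities": [{"cveID": "CVE-2021-44228"}]}
kev_ids = kev_ids_from_catalog(sample_catalog)
print("CVE-2021-44228" in kev_ids)
print("CVE-2020-0001" in kev_ids)
```

With the set in hand, the per-CVE check becomes `cve_id in kev_ids`, and a failed download surfaces as a single exception for the batch instead of several hundred.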

Step 2 — LLM-based prioritisation reasoning

# prioritiser.py — AI-assisted CVE prioritisation
import anthropic
from enum import Enum

from cve_context import CVEContext  # Dataclass defined in Step 1

class Priority(Enum):
    CRITICAL = "critical"    # Patch within 24 hours
    HIGH = "high"            # Patch within 7 days
    MEDIUM = "medium"        # Patch within 30 days
    LOW = "low"              # Patch within 90 days
    DEFER = "defer"          # Defer with explicit review date

client = anthropic.Anthropic()

PRIORITISATION_SYSTEM = """You are a vulnerability prioritisation analyst. 
Given context about a CVE and the affected environment, recommend a patch priority.

Your recommendations must be:
- Justified with specific reasoning tied to the provided data
- Conservative when context is uncertain (err toward higher priority)
- Explicit about what would change the recommendation

Priority definitions:
- CRITICAL: CISA KEV, internet-facing, exploitable, no compensating controls
- HIGH: High EPSS (>0.3), internet-facing, or handles PII/privileged access
- MEDIUM: Medium EPSS (0.05-0.3), reachable, internal service
- LOW: Low EPSS (<0.05), reachable, internal service, compensating controls exist
- DEFER: Very low EPSS (<0.01), confirmed unreachable, internal only

IMPORTANT: Never recommend DEFER for:
- CVEs in CISA KEV
- CVEs with EPSS > 0.3
- Services with internet-facing exposure and no compensating controls"""

def prioritise_cve(ctx: CVEContext) -> dict:
    """Use LLM to generate prioritised recommendation with reasoning."""
    
    context_summary = f"""
CVE: {ctx.cve_id}
CVSS Score: {ctx.cvss_score} ({ctx.cvss_vector})
EPSS Score: {ctx.epss_score:.4f} ({ctx.epss_percentile:.0%} percentile)
In CISA KEV: {ctx.in_cisa_kev}
Affected Package: {ctx.affected_package} v{ctx.affected_version}
Fixed Version Available: {ctx.fixed_version or 'No fix available yet'}
Affected Services: {', '.join(ctx.affected_services) if ctx.affected_services else 'None identified'}
Internet-Facing Services Affected: {ctx.service_internet_facing}
Handles PII: {ctx.service_handles_pii}
Privileged Cloud Access: {ctx.service_has_privileged_cloud_access}
Code Reachability: {ctx.reachability}
Compensating Controls: {', '.join(ctx.compensating_controls) if ctx.compensating_controls else 'None'}
CVE Description: {ctx.nvd_description[:500]}
"""
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=800,
        temperature=0,  # Reduce run-to-run variance (see Failure Modes)
        system=PRIORITISATION_SYSTEM,
        messages=[{
            "role": "user",
            "content": f"""Prioritise this vulnerability:

{context_summary}

Provide:
1. Priority: [CRITICAL/HIGH/MEDIUM/LOW/DEFER]
2. Reasoning: (2-3 sentences explaining the key factors)
3. Key risk: (the specific scenario that would materialise if unpatched)
4. Patch recommendation: (specific action — version bump, config change, etc.)
5. Deferral expiry: (if DEFER/LOW, when to re-evaluate — must have a date)
6. What would escalate this: (conditions that would move this to higher priority)"""
        }]
    )
    
    return {
        "cve_id": ctx.cve_id,
        "recommendation": response.content[0].text,
        "ai_generated": True,
        "requires_human_review": ctx.cvss_score >= 7.0 or ctx.in_cisa_kev or ctx.epss_score > 0.3,
        "context": ctx,
    }
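
The system prompt asks the model never to recommend DEFER in three situations, but a prompt is a request rather than an enforcement mechanism. A minimal sketch of checking the same rules in code follows. It assumes the reply contains a line shaped like `1. Priority: HIGH`, which is the format the user prompt requests but which the model is not guaranteed to follow. `extract_priority` and `defer_rule_violations` are hypothetical helpers, not part of any library.

```python
import re
from typing import Optional

# Matches "Priority: HIGH", "1. Priority: [HIGH]", "**Priority:** high", etc.
_PRIORITY_RE = re.compile(
    r"\bpriority\b[^\w\n]*:[^\w\n]*(CRITICAL|HIGH|MEDIUM|LOW|DEFER)\b",
    re.IGNORECASE,
)

def extract_priority(recommendation: str) -> Optional[str]:
    """Return the stated priority in upper case, or None if none is found."""
    match = _PRIORITY_RE.search(recommendation)
    return match.group(1).upper() if match else None

def defer_rule_violations(
    priority: Optional[str],
    *,
    in_cisa_kev: bool,
    epss_score: float,
    internet_facing: bool,
    compensating_controls: list[str],
) -> list[str]:
    """List the 'never DEFER' rules that a recommendation breaks."""
    if priority is None:
        return ["priority could not be parsed from the reply"]
    if priority != "DEFER":
        return []
    violations = []
    if in_cisa_kev:
        violations.append("CVE is in CISA KEV")
    if epss_score > 0.3:
        violations.append("EPSS > 0.3")
    if internet_facing and not compensating_controls:
        violations.append("internet-facing with no compensating controls")
    return violations

# Illustrative reply text, not real model output.
reply = "1. Priority: DEFER\n2. Reasoning: EPSS is very low."
priority = extract_priority(reply)
print(priority, defer_rule_violations(
    priority,
    in_cisa_kev=True,
    epss_score=0.004,
    internet_facing=False,
    compensating_controls=[],
))
```

The sketch returns violations instead of silently rewriting the priority, so that a non-empty list can route the item to human review. That keeps the analyst, not the guardrail, responsible for the final call.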

Step 3 — Build the prioritisation pipeline with mandatory review gates

from datetime import datetime, timedelta

from cve_context import CVEContext
from prioritiser import prioritise_cve

def run_prioritisation_batch(
    cve_contexts: list[CVEContext],
    analyst_name: str
) -> list[dict]:
    """Prioritise a batch of CVEs with mandatory review for high-severity items."""
    
    results = []
    auto_approved = []
    requires_review = []
    
    for ctx in cve_contexts:
        recommendation = prioritise_cve(ctx)
        
        # High-stakes decisions require human review
        if recommendation["requires_human_review"]:
            requires_review.append(recommendation)
        else:
            auto_approved.append(recommendation)
    
    print(f"\n=== Prioritisation Complete ===")
    print(f"Auto-approved (low/medium risk): {len(auto_approved)}")
    print(f"Requires human review (high/critical or high EPSS): {len(requires_review)}")
    
    # Auto-approved items: create tickets with deferred dates
    for rec in auto_approved:
        print(f"\n✓ AUTO: {rec['cve_id']}")
        print(f"  {rec['recommendation'][:200]}...")
        results.append({
            **rec,
            "approved_by": "automated",
            "approved_at": datetime.utcnow().isoformat(),
        })
    
    # Review-required items: present for analyst decision
    print(f"\n{'='*50}")
    print("ITEMS REQUIRING HUMAN REVIEW:")
    for rec in requires_review:
        print(f"\n⚠️  REVIEW REQUIRED: {rec['cve_id']}")
        print(rec['recommendation'])
        print(f"\nContext: CVSS={rec['context'].cvss_score}, "
              f"EPSS={rec['context'].epss_score:.4f}, "
              f"KEV={rec['context'].in_cisa_kev}")
        
        decision = input(f"\nApprove AI recommendation for {rec['cve_id']}? [y/N/override]: ")
        
        if decision.lower() == 'y':
            results.append({
                **rec,
                "approved_by": analyst_name,
                "approved_at": datetime.utcnow().isoformat(),
            })
        elif decision.lower() == 'override':
            override_priority = input("Enter override priority [CRITICAL/HIGH/MEDIUM/LOW/DEFER]: ")
            override_reason = input("Override reason: ")
            results.append({
                **rec,
                "priority_override": override_priority,
                "override_reason": override_reason,
                "approved_by": analyst_name,
                "approved_at": datetime.utcnow().isoformat(),
            })
        else:
            results.append({
                **rec,
                "status": "pending_review",
                "flagged_by": analyst_name,
            })
    
    return results
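
The review gate set in `prioritise_cve` keys on CVSS, KEV status, and EPSS only, whereas the Trade-offs section also calls for human review of anything touching an internet-facing service regardless of EPSS. A stricter gate might look like the sketch below. `ReviewInputs` is a hypothetical stand-in for the `CVEContext` fields the gate reads, used here only so the example runs on its own. Returning the reasons, not just a boolean, is a design choice of the sketch, not something the pipeline above does.

```python
from dataclasses import dataclass

@dataclass
class ReviewInputs:
    """Hypothetical stand-in for the CVEContext fields the gate reads."""
    cvss_score: float
    epss_score: float
    in_cisa_kev: bool
    service_internet_facing: bool

def review_reasons(ctx: ReviewInputs) -> list[str]:
    """Return every reason this CVE needs a human decision (empty = none)."""
    reasons = []
    if ctx.cvss_score >= 7.0:
        reasons.append("CVSS >= 7.0")
    if ctx.in_cisa_kev:
        reasons.append("in CISA KEV")
    if ctx.epss_score > 0.3:
        reasons.append("EPSS > 0.3")
    if ctx.service_internet_facing:
        reasons.append("internet-facing service")
    return reasons

# A medium-severity, low-EPSS finding that the original gate would auto-approve.
ctx = ReviewInputs(
    cvss_score=5.3,
    epss_score=0.02,
    in_cisa_kev=False,
    service_internet_facing=True,
)
print(review_reasons(ctx))
```

Showing the reasons to the analyst alongside the recommendation makes it clear why an item was held back from auto-approval.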

Step 4 — Compensating control expiry tracking

Deferred decisions that rely on compensating controls must have expiry dates:

# vulnerability-deferrals.yaml — track deferred CVEs with context
deferrals:
- cve_id: CVE-2026-XXXXX
  package: example-lib
  version: "1.4.2"
  priority: LOW
  rationale: "EPSS 0.008; function unreachable in production config; internal-only service"
  deferred_by: analyst@example.com
  deferred_at: "2026-05-12"
  review_date: "2026-08-12"   # Mandatory re-evaluation in 90 days
  compensating_controls:
  - description: "Service not internet-facing; internal VPN only"
    type: network_restriction
    verified_date: "2026-05-12"
    expiry_check: "Verify VPN-only access monthly via network scan"
  escalation_conditions:
  - "EPSS rises above 0.1"
  - "CISA adds to KEV catalog"
  - "Service becomes internet-facing"
  - "VPN access restriction removed"

from datetime import datetime

def check_deferral_expiry(deferrals: list[dict]) -> list[dict]:
    """Find deferrals whose review date has been reached."""
    today = datetime.today().date()
    due_for_review = []
    
    for deferral in deferrals:
        review_date = datetime.strptime(deferral["review_date"], "%Y-%m-%d").date()
        
        if review_date <= today:
            due_for_review.append({
                **deferral,
                "overdue_days": (today - review_date).days,
                "reason": "Review date reached"
            })
    
    return due_for_review
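
`check_deferral_expiry` looks only at the review date, while the deferral record also lists escalation conditions that can fire earlier. Two of those conditions (EPSS rising above 0.1 and addition to the KEV catalog) can be checked against fresh feed data. The sketch below assumes the caller supplies a `current_epss` mapping and a `kev_ids` set, both hypothetical inputs. The two network-exposure conditions need infrastructure data and are out of scope here.

```python
def check_escalations(
    deferrals: list[dict],
    current_epss: dict[str, float],
    kev_ids: set[str],
    epss_threshold: float = 0.1,
) -> list[dict]:
    """Return deferrals whose escalation conditions are met by fresh data."""
    escalated = []
    for deferral in deferrals:
        cve_id = deferral["cve_id"]
        reasons = []
        if cve_id in kev_ids:
            reasons.append("Added to CISA KEV")
        epss = current_epss.get(cve_id)
        if epss is None:
            # Assumption: a missing score is escalated rather than read as safe.
            reasons.append("No current EPSS score")
        elif epss > epss_threshold:
            reasons.append(f"EPSS {epss:.3f} above {epss_threshold}")
        if reasons:
            escalated.append({**deferral, "reason": "; ".join(reasons)})
    return escalated

# Hypothetical records and scores, for illustration only.
deferrals = [{"cve_id": "CVE-0000-0001"}, {"cve_id": "CVE-0000-0002"}]
fresh_epss = {"CVE-0000-0001": 0.25, "CVE-0000-0002": 0.004}
for item in check_escalations(deferrals, fresh_epss, kev_ids=set()):
    print(item["cve_id"], "->", item["reason"])
```

Running both checks on the same schedule means a deferral is reopened either when its date arrives or when the evidence behind it changes, whichever comes first.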

Expected Behaviour

Signal	Manual triage	AI-assisted
Time to triage 100 CVEs	8–20 analyst hours	1–2 hours (AI + review of flagged items)
CVEs requiring human review per cycle	100 (all)	~20 (CVSS ≥7, EPSS >0.3, or in CISA KEV)
Deferred CVEs with expiry dates	Rarely tracked	Every deferral has a `review_date`
CISA KEV auto-escalation	Manual check	Automated via KEV feed check
Context: “is the function reachable?”	Manual code review	Reachability result from static or dynamic analysis supplied as input; flagged for human confirmation

Verification:

# Test: CISA KEV item must not receive DEFER recommendation
test_ctx = CVEContext(
    cve_id="CVE-2021-44228",  # Log4Shell — in KEV
    cvss_score=10.0,
    cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
    epss_score=0.97555,
    epss_percentile=1.0,
    in_cisa_kev=True,
    affected_package="log4j-core",
    affected_version="2.14.1",
    fixed_version="2.17.1",
)

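import re

result = prioritise_cve(test_ctx)
# Check only the stated priority; "Deferral expiry" elsewhere would trip a substring test.
stated = re.search(r"priority\W*(CRITICAL|HIGH|MEDIUM|LOW|DEFER)\b",
                   result["recommendation"], re.IGNORECASE)
assert stated and stated.group(1).upper() != "DEFER", \
    "Log4Shell should never be deferred"
print(f"Log4Shell priority: {result['recommendation'][:100]}")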

Trade-offs

Aspect	Benefit	Cost	Mitigation
AI triage for low-risk CVEs	Frees analyst time for complex cases	AI may misclassify a low-EPSS CVE that is being actively exploited but is not yet reflected in EPSS or KEV	Set conservative thresholds; require human review for anything touching internet-facing services regardless of EPSS
EPSS as primary signal	Empirically validated exploitation probability	EPSS lags on novel CVEs: scores often start low and rise only once exploitation evidence accumulates	Supplement EPSS with threat intelligence feeds; never use EPSS alone for internet-facing services
Deferral expiry dates	Prevents permanently deferred CVEs	Requires process discipline to act on expiry alerts	Integrate expiry checks into your weekly security meeting agenda; auto-create tickets on expiry
AI reasoning transparency	Recommendation includes explicit rationale	Analysts may rubber-stamp without reading	Require explicit acknowledgment of the key risk statement in each recommendation

Failure Modes

Failure	Symptom	Detection	Recovery
EPSS feed unavailable	All CVEs get EPSS 0.0; many incorrectly deferred	EPSS field is 0.0 for all CVEs in the batch	Detect zero-EPSS batches and block auto-approval until feed is restored
Reachability false negative	AI reports function unreachable; function is called via reflection	CVE exploited in production; post-incident reachability analysis finds the call path	Mark reachability as “unknown” when static analysis is inconclusive; treat unknown as reachable
Compensating control silently removed	Deferred CVE has no active protection; review date not reached yet	Infrastructure change not linked to CVE deferral tracker	Integrate change management with deferral tracker; any change to a listed compensating control triggers re-evaluation
LLM inconsistency across runs	Same CVE gets different priority on different days	Batch prioritisation produces different results when re-run	Pin model version; set `temperature=0` to reduce (not eliminate) run-to-run variance; log all recommendations for audit
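
The recovery for the first failure mode, detecting zero-EPSS batches, can be sketched in a few lines. The helper name `epss_feed_suspect` is hypothetical, and the sketch assumes the pipeline can collect the EPSS scores for a batch before any auto-approval happens.

```python
def epss_feed_suspect(epss_scores: list[float]) -> bool:
    """True when a non-empty batch has EPSS 0.0 for every CVE."""
    return bool(epss_scores) and all(score == 0.0 for score in epss_scores)

# Illustrative batches: one healthy, one that looks like a feed outage.
healthy_batch = [0.0, 0.012, 0.43]
outage_batch = [0.0, 0.0, 0.0]

for name, batch in [("healthy", healthy_batch), ("outage", outage_batch)]:
    if epss_feed_suspect(batch):
        print(f"{name}: EPSS looks unavailable, hold auto-approval")
    else:
        print(f"{name}: EPSS present, continue")
```

A single-CVE batch with a score of 0.0 is also flagged under this rule, which errs on the side of an unnecessary review, not a silent deferral.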

Related Articles

Vulnerability Management Program — the process framework within which AI prioritisation operates
Zero-Day Response Playbook — the emergency escalation path for CVEs that bypass the normal prioritisation queue
AI-Assisted Hardening — broader use of AI tools for security hardening, of which patch prioritisation is one component
SBOM Generation and Consumption — the SBOM data that feeds the vulnerability inventory used by AI prioritisation
Container Vulnerability Scanning in CI — generating the CVE findings that feed into the AI prioritisation pipeline