AI-Assisted CVE Patch Prioritisation: EPSS, Reachability, and Business Context

Problem

A medium-sized engineering organisation running 50 services will accumulate hundreds of open CVEs at any given time. Grype or Trivy scanning container images and dependencies produces a backlog that looks manageable in a dashboard — until you try to actually work through it. CVSS scores don’t distinguish between a critical vulnerability in a package that is linked but never called and one in the hottest code path of your most exposed service. A score of 9.8 is frightening on paper; it may be irrelevant if the vulnerable function requires a configuration that no production deployment uses.

Traditional prioritisation approaches have two failure modes:

Severity-only prioritisation leads to patch fatigue and under-patching of genuinely dangerous vulnerabilities. Every CVSS ≥7.0 gets a ticket, engineers are overwhelmed, and the backlog grows until a triage shortcut — “if nothing is actively exploited, skip it” — becomes informal policy. The vulnerabilities that matter get lost in the noise.

Manual triage by security analysts doesn’t scale. Researching the exploitability context of 300 CVEs takes weeks of analyst time per cycle, requires subsystem expertise that no single analyst has, and produces inconsistent quality depending on who does the research.

AI tools address both problems. An LLM with access to the right data sources can, in seconds per CVE, synthesise:

  • EPSS score — the Exploit Prediction Scoring System probability that the CVE will be exploited in the wild within 30 days
  • CISA KEV status — whether the vulnerability is already being actively exploited
  • Reachability analysis — whether the vulnerable code path is actually called in your codebase (static or dynamic analysis result)
  • Deployment context — whether the affected service is internet-facing, handles PII, or has privileged cloud access
  • Compensating controls — whether a WAF rule, network restriction, or configuration change already mitigates the vulnerability
  • Patch complexity — whether the fix is a one-line version bump or a major API refactor

The output is a prioritised, reasoned recommendation: “Patch CVE-X this week because its EPSS score is 78% and the vulnerable endpoint is internet-facing and lacks WAF coverage. Defer CVE-Y for 90 days because its EPSS score is 0.02%, the function is never called in any deployment, and the service is internal-only.”

The risk is over-reliance. An AI that incorrectly classifies a vulnerability as “low reachability” because it misreads the call graph, or that misses a compensating control that was later removed, can cause a genuinely dangerous CVE to be deferred. The AI reduces analyst time but cannot replace analyst judgment on edge cases, and any automated deferral decision needs to be reviewable and reversible.

Target systems: security teams managing vulnerability backlogs of >100 open CVEs; organisations with >10 production services; teams using Grype, Trivy, Snyk, or similar SCA tools; any environment where patch velocity is constrained by competing priorities.


Threat Model

The threat in this context is the risk of incorrect prioritisation:

Risk 1 — False low priority (dangerous deferral). AI incorrectly classifies a critical exploitable vulnerability as low priority. A threat actor exploits it within the deferral window.

Risk 2 — False high priority (alarm fatigue). AI generates too many high-priority recommendations. Engineers cannot work through the backlog, and the genuine high-priority items are delayed by the false positives.

Risk 3 — Stale context (compensating control removed). AI prioritisation correctly accounts for a compensating control (WAF rule). The WAF rule is later removed. The CVE remains in “deferred” status. The compensating control that justified the deferral no longer exists.

Risk 4 — Reachability hallucination. AI reports a function is unreachable based on a misread of the codebase. The function is actually called via reflection or dynamic dispatch that static analysis missed.

All four risks are mitigated by treating AI prioritisation as a recommendation requiring human review, not as an automated decision, and by building an expiry mechanism into deferred decisions.


Configuration / Implementation

Step 1 — Build the data pipeline: CVE context aggregation

Before calling the LLM, aggregate all available context for each CVE:

# cve_context.py — aggregate context for a CVE before LLM analysis
import httpx
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CVEContext:
    cve_id: str
    cvss_score: float
    cvss_vector: str
    epss_score: float          # 0.0–1.0 probability of exploitation
    epss_percentile: float
    in_cisa_kev: bool          # Actively exploited per CISA
    affected_package: str
    affected_version: str
    fixed_version: Optional[str]
    affected_services: list[str] = field(default_factory=list)
    reachability: str = "unknown"  # "reachable", "unreachable", "unknown"
    service_internet_facing: bool = False
    service_handles_pii: bool = False
    service_has_privileged_cloud_access: bool = False
    compensating_controls: list[str] = field(default_factory=list)
    nvd_description: str = ""

async def fetch_epss(cve_id: str) -> tuple[float, float]:
    """Fetch EPSS score from FIRST API."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api.first.org/data/v1/epss?cve={cve_id}"
        )
        data = response.json()
        if data.get("data"):
            epss = data["data"][0]
            return float(epss["epss"]), float(epss["percentile"])
    return 0.0, 0.0

async def check_cisa_kev(cve_id: str) -> bool:
    """Check if CVE is in CISA Known Exploited Vulnerabilities catalog."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
        )
        kev = response.json()
        return any(v["cveID"] == cve_id for v in kev.get("vulnerabilities", []))

async def build_cve_context(
    cve_id: str,
    affected_package: str,
    affected_version: str,
    service_inventory: dict
) -> CVEContext:
    """Build complete context for AI prioritisation."""
    epss_score, epss_percentile = await fetch_epss(cve_id)
    in_kev = await check_cisa_kev(cve_id)
    
    # Identify affected services from inventory
    affected_services = [
        service for service, deps in service_inventory.items()
        if any(d["package"] == affected_package and d["version"] == affected_version
               for d in deps)
    ]
    
    return CVEContext(
        cve_id=cve_id,
        cvss_score=0.0,  # Filled from scanner output
        cvss_vector="",
        epss_score=epss_score,
        epss_percentile=epss_percentile,
        in_cisa_kev=in_kev,
        affected_package=affected_package,
        affected_version=affected_version,
        fixed_version=None,  # Required dataclass field; filled from scanner output
        affected_services=affected_services,
    )
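
As written, check_cisa_kev downloads the full KEV catalogue once per CVE, which adds up quickly over a batch of several hundred. A minimal sketch of one way around that, assuming the feed is fetched once per batch and handed around as an already-parsed dict. The names build_kev_index and in_kev are hypothetical helpers, not part of the module above:

```python
# kev_index.py: hypothetical helper that indexes the KEV feed once per batch

def build_kev_index(kev_feed: dict) -> frozenset[str]:
    """Return the set of CVE IDs in an already-parsed KEV JSON document."""
    return frozenset(
        entry["cveID"].upper()
        for entry in kev_feed.get("vulnerabilities", [])
        if "cveID" in entry
    )

def in_kev(cve_id: str, kev_index: frozenset[str]) -> bool:
    """Case-insensitive membership check against the prebuilt index."""
    return cve_id.upper() in kev_index

# Tiny stand-in with the same top-level shape as the real feed
sample_feed = {"vulnerabilities": [{"cveID": "CVE-2021-44228"}]}
kev_index = build_kev_index(sample_feed)
```

With an index like this, build_cve_context could accept the prebuilt set as a parameter instead of awaiting a network call for each CVE.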

Step 2 — LLM-based prioritisation reasoning

# prioritiser.py — AI-assisted CVE prioritisation
import anthropic
from enum import Enum

from cve_context import CVEContext  # dataclass defined in Step 1

class Priority(Enum):
    CRITICAL = "critical"    # Patch within 24 hours
    HIGH = "high"            # Patch within 7 days
    MEDIUM = "medium"        # Patch within 30 days
    LOW = "low"              # Patch within 90 days
    DEFER = "defer"          # Defer with explicit review date

client = anthropic.Anthropic()

PRIORITISATION_SYSTEM = """You are a vulnerability prioritisation analyst. 
Given context about a CVE and the affected environment, recommend a patch priority.

Your recommendations must be:
- Justified with specific reasoning tied to the provided data
- Conservative when context is uncertain (err toward higher priority)
- Explicit about what would change the recommendation

Priority definitions:
- CRITICAL: CISA KEV, internet-facing, exploitable, no compensating controls
- HIGH: High EPSS (>0.3), internet-facing, or handles PII/privileged access
- MEDIUM: Medium EPSS (0.05-0.3), reachable, internal service
- LOW: Low EPSS (<0.05), reachable, internal service, compensating controls exist
- DEFER: Very low EPSS (<0.01), confirmed unreachable, internal only

IMPORTANT: Never recommend DEFER for:
- CVEs in CISA KEV
- CVEs with EPSS > 0.3
- Services with internet-facing exposure and no compensating controls"""

def prioritise_cve(ctx: CVEContext) -> dict:
    """Use LLM to generate prioritised recommendation with reasoning."""
    
    context_summary = f"""
CVE: {ctx.cve_id}
CVSS Score: {ctx.cvss_score} ({ctx.cvss_vector})
EPSS Score: {ctx.epss_score:.4f} ({ctx.epss_percentile:.0%} percentile)
In CISA KEV: {ctx.in_cisa_kev}
Affected Package: {ctx.affected_package} v{ctx.affected_version}
Fixed Version Available: {ctx.fixed_version or 'No fix available yet'}
Affected Services: {', '.join(ctx.affected_services) if ctx.affected_services else 'None identified'}
Internet-Facing Services Affected: {ctx.service_internet_facing}
Handles PII: {ctx.service_handles_pii}
Privileged Cloud Access: {ctx.service_has_privileged_cloud_access}
Code Reachability: {ctx.reachability}
Compensating Controls: {', '.join(ctx.compensating_controls) if ctx.compensating_controls else 'None'}
CVE Description: {ctx.nvd_description[:500]}
"""
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=800,
        temperature=0,  # reduce run-to-run variation (see Failure Modes)
        system=PRIORITISATION_SYSTEM,
        messages=[{
            "role": "user",
            "content": f"""Prioritise this vulnerability:

{context_summary}

Provide:
1. Priority: [CRITICAL/HIGH/MEDIUM/LOW/DEFER]
2. Reasoning: (2-3 sentences explaining the key factors)
3. Key risk: (the specific scenario that would materialise if unpatched)
4. Patch recommendation: (specific action — version bump, config change, etc.)
5. Deferral expiry: (if DEFER/LOW, when to re-evaluate — must have a date)
6. What would escalate this: (conditions that would move this to higher priority)"""
        }]
    )
    
    return {
        "cve_id": ctx.cve_id,
        "recommendation": response.content[0].text,
        "ai_generated": True,
        "requires_human_review": ctx.cvss_score >= 7.0 or ctx.in_cisa_kev or ctx.epss_score > 0.3,
        "context": ctx,
    }
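
The “never recommend DEFER” rules above live only in the system prompt, so nothing in the code stops a reply that ignores them. A sketch of a deterministic post-check, assuming the reply contains a line of the form “Priority: HIGH” as the user prompt requests. The names parse_priority, defer_forbidden and needs_escalation are hypothetical, and the regular expression is an assumption about the reply format rather than a guarantee:

```python
# guardrails.py: hypothetical post-check on the model's free-text reply
import re
from typing import Optional

PRIORITY_RE = re.compile(
    r"priority\W{0,10}(CRITICAL|HIGH|MEDIUM|LOW|DEFER)\b", re.IGNORECASE
)

def parse_priority(recommendation: str) -> Optional[str]:
    """Extract the first stated priority, or None if the reply has none."""
    match = PRIORITY_RE.search(recommendation)
    return match.group(1).upper() if match else None

def defer_forbidden(in_cisa_kev: bool, epss_score: float,
                    internet_facing: bool,
                    compensating_controls: list[str]) -> bool:
    """Mirror the three 'never DEFER' rules from the system prompt."""
    return (
        in_cisa_kev
        or epss_score > 0.3
        or (internet_facing and not compensating_controls)
    )

def needs_escalation(recommendation: str, **ctx) -> bool:
    """True if the reply is unparseable or defers a CVE the rules protect."""
    priority = parse_priority(recommendation)
    if priority is None:
        return True  # fail closed: an unreadable reply goes to a human
    return priority == "DEFER" and defer_forbidden(**ctx)

# A reply that defers a KEV-listed CVE should be flagged
reply = "1. Priority: DEFER\n2. Reasoning: low EPSS, internal only"
flag = needs_escalation(
    reply, in_cisa_kev=True, epss_score=0.002,
    internet_facing=False, compensating_controls=[],
)
```

Failing closed on an unparseable reply keeps the check conservative: a formatting change in the model output sends items to review instead of silently passing them.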

Step 3 — Build the prioritisation pipeline with mandatory review gates

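from datetime import datetime, timedelta

from cve_context import CVEContext       # dataclass defined in Step 1
from prioritiser import prioritise_cve   # LLM call defined in Step 2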

def run_prioritisation_batch(
    cve_contexts: list[CVEContext],
    analyst_name: str
) -> list[dict]:
    """Prioritise a batch of CVEs with mandatory review for high-severity items."""
    
    results = []
    auto_approved = []
    requires_review = []
    
    for ctx in cve_contexts:
        recommendation = prioritise_cve(ctx)
        
        # High-stakes decisions require human review
        if recommendation["requires_human_review"]:
            requires_review.append(recommendation)
        else:
            auto_approved.append(recommendation)
    
    print(f"\n=== Prioritisation Complete ===")
    print(f"Auto-approved (low/medium risk): {len(auto_approved)}")
    print(f"Requires human review (high/critical or high EPSS): {len(requires_review)}")
    
    # Auto-approved items: create tickets with deferred dates
    for rec in auto_approved:
        print(f"\n✓ AUTO: {rec['cve_id']}")
        print(f"  {rec['recommendation'][:200]}...")
        results.append({
            **rec,
            "approved_by": "automated",
            "approved_at": datetime.utcnow().isoformat(),
        })
    
    # Review-required items: present for analyst decision
    print(f"\n{'='*50}")
    print("ITEMS REQUIRING HUMAN REVIEW:")
    for rec in requires_review:
        print(f"\n⚠️  REVIEW REQUIRED: {rec['cve_id']}")
        print(rec['recommendation'])
        print(f"\nContext: CVSS={rec['context'].cvss_score}, "
              f"EPSS={rec['context'].epss_score:.4f}, "
              f"KEV={rec['context'].in_cisa_kev}")
        
        decision = input(f"\nApprove AI recommendation for {rec['cve_id']}? [y/N/override]: ")
        
        if decision.lower() == 'y':
            results.append({
                **rec,
                "approved_by": analyst_name,
                "approved_at": datetime.utcnow().isoformat(),
            })
        elif decision.lower() == 'override':
            override_priority = input("Enter override priority [CRITICAL/HIGH/MEDIUM/LOW/DEFER]: ")
            override_reason = input("Override reason: ")
            results.append({
                **rec,
                "priority_override": override_priority,
                "override_reason": override_reason,
                "approved_by": analyst_name,
                "approved_at": datetime.utcnow().isoformat(),
            })
        else:
            results.append({
                **rec,
                "status": "pending_review",
                "flagged_by": analyst_name,
            })
    
    return results
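
The result dicts carry the CVEContext dataclass under "context", which json.dumps cannot serialise directly. If the results are to be kept as an audit trail, one option is to flatten each result into a single JSON line. The sketch below assumes that approach; to_audit_line is a hypothetical helper and _Ctx is a stand-in for CVEContext so the example runs on its own:

```python
# audit_log.py: hypothetical helper producing one JSON line per decision
import json
from dataclasses import asdict, dataclass, is_dataclass

def to_audit_line(result: dict) -> str:
    """Serialise one result dict, expanding any dataclass instances."""
    record = {}
    for key, value in result.items():
        # is_dataclass is also true for classes; asdict needs an instance
        if is_dataclass(value) and not isinstance(value, type):
            record[key] = asdict(value)
        else:
            record[key] = value
    return json.dumps(record, sort_keys=True)

@dataclass
class _Ctx:  # stand-in for CVEContext
    cve_id: str
    epss_score: float

line = to_audit_line({
    "cve_id": "CVE-2021-44228",
    "approved_by": "automated",
    "context": _Ctx("CVE-2021-44228", 0.97555),
})
```

Appending these lines to a write-once log gives reviewers a record of who approved what, alongside the context the model saw at the time.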

Step 4 — Compensating control expiry tracking

Deferred decisions that rely on compensating controls must have expiry dates:

# vulnerability-deferrals.yaml — track deferred CVEs with context
deferrals:
- cve_id: CVE-2026-XXXXX
  package: example-lib
  version: "1.4.2"
  priority: LOW
  rationale: "EPSS 0.008; function unreachable in production config; internal-only service"
  deferred_by: analyst@example.com
  deferred_at: "2026-05-12"
  review_date: "2026-08-12"   # Mandatory re-evaluation in 90 days
  compensating_controls:
  - description: "Service not internet-facing; internal VPN only"
    type: network_restriction
    verified_date: "2026-05-12"
    expiry_check: "Verify VPN-only access monthly via network scan"
  escalation_conditions:
  - "EPSS rises above 0.1"
  - "CISA adds to KEV catalog"
  - "Service becomes internet-facing"
  - "VPN access restriction removed"

The tracker can then be scanned for deferrals whose review date has passed:

from datetime import datetime

def check_deferral_expiry(deferrals: list[dict]) -> list[dict]:
    """Find deferrals whose review_date is today or earlier."""
    today = datetime.today().date()
    due_for_review = []
    
    for deferral in deferrals:
        review_date = datetime.strptime(deferral["review_date"], "%Y-%m-%d").date()
        
        if review_date <= today:
            due_for_review.append({
                **deferral,
                "overdue_days": (today - review_date).days,
                "reason": "Review date reached"
            })
    
    return due_for_review
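
The expiry check only looks at review_date, so a deferral whose context changes before that date goes unnoticed. A sketch of a companion check, assuming the caller supplies the current EPSS score and KEV status. The function name recheck_reasons and both thresholds are assumptions: 0.1 is taken from the sample escalation condition and 30 days from the “monthly” wording in expiry_check:

```python
# deferral_recheck.py: hypothetical companion to check_deferral_expiry
from datetime import date, datetime

def recheck_reasons(
    deferral: dict,
    current_epss: float,
    in_kev: bool,
    today: date,
    epss_threshold: float = 0.1,     # assumed, from the sample YAML
    max_verify_age_days: int = 30,   # assumed, "monthly" verification
) -> list[str]:
    """Return every reason this deferral should be re-evaluated now."""
    reasons = []
    if in_kev:
        reasons.append("CVE is now in CISA KEV")
    if current_epss > epss_threshold:
        reasons.append(f"EPSS {current_epss:.3f} exceeds {epss_threshold}")
    for control in deferral.get("compensating_controls", []):
        verified = datetime.strptime(
            control["verified_date"], "%Y-%m-%d"
        ).date()
        age = (today - verified).days
        if age > max_verify_age_days:
            reasons.append(
                f"Control '{control['type']}' last verified {age} days ago"
            )
    return reasons

sample = {
    "cve_id": "CVE-2026-XXXXX",
    "compensating_controls": [
        {"type": "network_restriction", "verified_date": "2026-05-12"}
    ],
}
reasons = recheck_reasons(
    sample, current_epss=0.008, in_kev=False, today=date(2026, 7, 1)
)
```

An empty list means nothing has changed. Any non-empty list is a cue to reopen the deferral, whether or not the review date has arrived.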

Expected Behaviour

| Signal | Manual triage | AI-assisted |
|---|---|---|
| Time to triage 100 CVEs | 8–20 analyst hours | 1–2 hours (AI + review of flagged items) |
| CVEs requiring human review per cycle | 100 (all) | ~20 (CVSS ≥7 or EPSS >0.3) |
| Deferred CVEs with expiry dates | Rarely tracked | Every deferral has a review_date |
| CISA KEV auto-escalation | Manual check | Automated via KEV feed check |
| Context: “is the function reachable?” | Manual code review | EPSS + reachability input; flagged for human confirmation |

Verification:

# Test: CISA KEV item must not receive DEFER recommendation
test_ctx = CVEContext(
    cve_id="CVE-2021-44228",  # Log4Shell — in KEV
    cvss_score=10.0,
    cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
    epss_score=0.97555,
    epss_percentile=1.0,
    in_cisa_kev=True,
    affected_package="log4j-core",
    affected_version="2.14.1",
    fixed_version="2.17.1",
)

result = prioritise_cve(test_ctx)
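# Check only the stated priority: the reply may legitimately contain
# "deferral" elsewhere (for example in the "Deferral expiry" field).
import re
stated = re.search(r"priority\W{0,10}(CRITICAL|HIGH|MEDIUM|LOW|DEFER)\b",
                   result["recommendation"], re.IGNORECASE)
assert stated and stated.group(1).upper() != "DEFER", \
    "Log4Shell should never be deferred"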
print(f"Log4Shell priority: {result['recommendation'][:100]}")

Trade-offs

| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| AI triage for low-risk CVEs | Frees analyst time for complex cases | AI may misclassify a low-EPSS CVE that is being actively exploited but not yet in EPSS/KEV | Set conservative thresholds; require human review for anything touching internet-facing services regardless of EPSS |
| EPSS as primary signal | Empirically validated exploitation probability | EPSS is reactive: it rises after exploitation begins; novel CVEs have low initial EPSS | Supplement EPSS with threat intelligence feeds; never use EPSS alone for internet-facing services |
| Deferral expiry dates | Prevents permanently deferred CVEs | Requires process discipline to act on expiry alerts | Integrate expiry checks into your weekly security meeting agenda; auto-create tickets on expiry |
| AI reasoning transparency | Recommendation includes explicit rationale | Analysts may rubber-stamp without reading | Require explicit acknowledgment of the key risk statement in each recommendation |

Failure Modes

| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| EPSS feed unavailable | All CVEs get EPSS 0.0; many incorrectly deferred | EPSS field is 0.0 for all CVEs in the batch | Detect zero-EPSS batches and block auto-approval until feed is restored |
| Reachability false negative | AI reports function unreachable; function is called via reflection | CVE exploited in production; post-incident reachability analysis finds the call path | Mark reachability as “unknown” when static analysis is inconclusive; treat unknown as reachable |
| Compensating control silently removed | Deferred CVE has no active protection; review date not reached yet | Infrastructure change not linked to CVE deferral tracker | Integrate change management with deferral tracker; any change to a listed compensating control triggers re-evaluation |
| LLM inconsistency across runs | Same CVE gets different priority on different days | Batch prioritisation produces different results when re-run | Pin model version; use temperature=0 to reduce run-to-run variation; log all recommendations for audit |
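
The first recovery above, blocking auto-approval on an all-zero EPSS batch, can be sketched as a pre-flight check. This assumes the scores are collected from the batch's CVEContext objects before any approval step runs; epss_feed_suspect is a hypothetical name:

```python
# epss_sanity.py: hypothetical pre-flight check before auto-approval

def epss_feed_suspect(epss_scores: list[float]) -> bool:
    """True when a non-empty batch contains nothing but 0.0 scores.

    fetch_epss falls back to 0.0 when the API response carries no data,
    so an all-zero batch more plausibly signals a feed problem than a
    batch of genuinely unexploitable CVEs.
    """
    return bool(epss_scores) and all(score == 0.0 for score in epss_scores)

batch_scores = [0.0, 0.0, 0.0]
block_auto_approval = epss_feed_suspect(batch_scores)
```

When the check fires, routing the whole batch to requires_review is the conservative choice until the feed is confirmed healthy.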