OpenSSF Scorecard: Automated Open Source Dependency Risk Scoring

What Scorecard Measures

OpenSSF Scorecard evaluates a GitHub-hosted open source project across roughly 20 automated checks and aggregates them into a composite score from 0 to 10. The composite is a weighted average keyed to each check's risk level: Dangerous-Workflow is rated critical, and high-risk checks such as Binary-Artifacts and Token-Permissions carry more weight than low-risk checks such as CI-Tests and Contributors. Each check returns an integer score from 0 to 10 (or -1 when the result is inconclusive), a one-line reason, and a documentation URL. The full result is available in JSON or SARIF format.
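To make the weighted average concrete, here is a minimal sketch of the arithmetic. The weights (10, 7.5, 5, and 2.5 for critical, high, medium, and low risk) and the treatment of -1 as "inconclusive, skip" are assumptions made for illustration, and the composite function is a hypothetical helper, not part of the Scorecard CLI.

```shell
#!/usr/bin/env bash
# Sketch of a weighted average over per-check scores.
# Input lines: "<check-name> <weight> <score>".
# The weights are illustrative assumptions, not values taken from this article.
composite() {
  awk '
    $3 >= 0 { num += $2 * $3; den += $2 }   # skip inconclusive checks (score -1)
    END     { if (den > 0) printf "%.1f\n", num / den; else print "N/A" }
  '
}

# Two passing high-weight checks and two failing lower-weight checks.
composite <<'EOF'
Dangerous-Workflow 10 10
Binary-Artifacts 7.5 10
SAST 5 0
CI-Tests 2.5 0
EOF
```

With these inputs the result is (100 + 75) / 25 = 7.0: two failing lower-weight checks pull the composite down, but far less than a failing critical check would.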

The checks that matter most for dependency risk decisions:

Branch-Protection measures whether the default branch enforces code review before merge, prevents force pushes, and requires status checks. A score of 0 means any contributor with push access can merge code directly without review. A score of 8 or above typically means branch protection requires at least one reviewer, dismisses stale approvals, and enforces linear history.

CI-Tests checks whether the repository runs automated tests on pull requests. A project that never runs CI on proposed changes cannot detect whether a patch breaks behaviour, including security behaviour. Score 0 means no CI evidence; score 10 means all recent commits have associated CI runs.

Contributors assesses whether the project has a diverse set of contributors from multiple organisations. A project with a single contributor from a single organisation is one compromised account away from full supply chain compromise. Higher scores indicate organisational breadth.

Maintained checks commit frequency over the past 90 days. Score 0 means no commits in 90 days; score 10 means sustained recent activity. A score below 3 means that even if a vulnerability is reported, a fix is unlikely to arrive promptly.

Pinned-Dependencies checks whether the project’s own CI and build dependencies are pinned to immutable references — commit digests rather than mutable tags. A CI step using actions/checkout@v4 (a mutable tag) scores lower than one using actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683. Score 0 means all dependencies use mutable tags; score 10 means all are pinned to digests.
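As a rough local approximation of what this check looks for, the sketch below lists workflow steps whose uses: reference is not a full 40-character commit SHA. It is a text-matching heuristic, not Scorecard's own logic, and unpinned_actions is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List "uses:" references that are not pinned to a 40-character commit SHA.
# Local actions (./path) and docker:// references are skipped for brevity.
unpinned_actions() {
  grep -HnE '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^.[:space:]]' "$@" \
    | grep -vE 'uses:[[:space:]]*docker://' \
    | grep -vE '@[0-9a-f]{40}([^0-9a-f]|$)' \
    || true   # no matches is a clean result, not an error
}

# Usage (assumes workflows live in the conventional directory):
#   unpinned_actions .github/workflows/*.yml
```

Each output line has the form file:line:content, so the result can be pasted straight into a review comment.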

Signed-Releases checks whether published release artifacts are cryptographically signed. Score 0 means no signing evidence; score 8 or above means release artifacts are signed and verifiable by downstream consumers. Projects that publish to npm, PyPI, or container registries without signing provide no integrity guarantee beyond what the registry itself offers.

Token-Permissions checks whether the repository’s GitHub Actions workflows request only the minimum token permissions required. Workflows that request contents: write or packages: write by default when they only need contents: read expose a larger blast radius if the workflow is compromised.

Vulnerabilities checks whether the project has open, unaddressed vulnerabilities recorded in the OSV database. This is the only check that directly overlaps with traditional vulnerability scanning; the others measure process maturity. A failing Vulnerabilities check means the project has known, unpatched vulnerabilities in its own codebase.

SAST checks whether the project runs static analysis tooling — CodeQL, Semgrep, Coverity — as part of CI. Score 0 means no SAST evidence; higher scores indicate automated static analysis runs on new code.

Security-Policy checks for the presence of a SECURITY.md file with a private disclosure channel. Score 0 means no SECURITY.md exists; score 10 means a policy is present and parseable. No security policy means vulnerability researchers have no private channel: they will open public issues or public PRs, eliminating any coordination window for downstream consumers.

Dependency-Update-Tool checks whether the project uses Dependabot, Renovate, or an equivalent tool to keep its own dependencies current. Score 0 means no automated updates; score 10 means an update tool is configured and active. A project that does not update its own dependencies is accumulating transitive vulnerability exposure that your vulnerability scanner will eventually surface.

Binary-Artifacts checks for pre-built binaries committed to the repository. Committed binaries cannot be verified against source; they are a well-established supply chain compromise vector. Score 0 means binaries are present; score 10 means none are committed.

Dangerous-Workflow checks for GitHub Actions patterns that run untrusted code in a privileged context — specifically pull_request_target combined with checkout of the PR head, which grants fork PRs access to repository secrets. Score 0 means one or more dangerous patterns are present; score 10 means none are detected. This is one of the highest-weight checks because a failing result indicates an exploitable attack surface against the project’s own CI.
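The pattern described here can be approximated with a simple text search. The sketch below flags any workflow file that mentions pull_request_target and also references the pull request head; it is a coarse heuristic that will produce false positives and miss indirect cases, and dangerous_workflows is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print workflow files that use pull_request_target AND reference the PR head.
# Pure text matching: treat hits as candidates for manual review.
dangerous_workflows() {
  local f
  for f in "$@"; do
    if grep -q 'pull_request_target' "$f" \
       && grep -qE 'github\.event\.pull_request\.head\.(sha|ref)' "$f"; then
      echo "$f"
    fi
  done
}

# Usage:
#   dangerous_workflows .github/workflows/*.yml
```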

Running Scorecard Locally

Install the CLI and run it against any public GitHub repository. Authentication is required to avoid aggressive API rate limiting:

export GITHUB_AUTH_TOKEN="ghp_your_token_here"

scorecard --repo=github.com/grpc/grpc-go --format json | jq .

The JSON output includes the top-level fields repo, score (the composite 0–10 float), and checks (array of per-check results), alongside metadata such as the run date and the Scorecard version. Each check object contains name, score, reason, details (an array of strings, populated when --show-details is passed), and documentation.url.

Pull only the fields relevant for a pass/fail decision:

scorecard --repo=github.com/grpc/grpc-go --format json \
  | jq '[.checks[] | {name, score, reason}] | sort_by(.score)'

Show only checks below a threshold of 5:

scorecard --repo=github.com/hashicorp/vault --format json \
  | jq '[.checks[] | select(.score < 5)] | sort_by(.score)'

Extract the composite score for scripting:

score=$(scorecard --repo=github.com/some-org/some-repo --format json \
  | jq -r '.score')
echo "Composite score: ${score}"

To score a batch of direct dependencies — for example, extracted from a Go module graph:

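#!/usr/bin/env bash
set -euo pipefail

THRESHOLD=5
FAILED=0

# Read from process substitution rather than a pipeline so that FAILED is
# updated in this shell, not in a subshell that is discarded.
while read -r module; do
  # Reduce paths such as github.com/org/repo/v2 to github.com/org/repo.
  repo=$(echo "$module" | cut -d'/' -f1-3)
  # A scan failure is treated as score 0 so that the gate fails closed.
  result=$(scorecard --repo="${repo}" --format json 2>/dev/null \
             || echo '{"score":0,"checks":[]}')
  composite=$(echo "$result" | jq -r '.score // 0')
  if awk -v s="$composite" -v t="$THRESHOLD" 'BEGIN{exit !(s < t)}'; then
    echo "FAIL score=${composite} repo=${repo}"
    FAILED=1
  else
    echo "PASS score=${composite} repo=${repo}"
  fi
done < <(
  # Direct dependencies only: skip the main module and indirect requirements.
  go list -m -f '{{if not (or .Main .Indirect)}}{{.Path}}{{end}}' all \
    | grep '^github\.com/' \
    | sort -u
)

exit "$FAILED"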

Each check issues one or more GitHub API calls, so scoring a single repository consumes roughly 40–60 API requests. Against the default authenticated rate limit of 5,000 requests per hour, that works out to approximately 80–125 repositories per hour per token.
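The throughput estimate is a simple division, sketched below so it can be re-run with a different budget or per-repository cost. The repos_per_hour helper is hypothetical; the 5,000, 40, and 60 figures come from the paragraph above.

```shell
#!/usr/bin/env bash
# Repositories per hour for a given hourly request budget and per-repository cost.
repos_per_hour() {
  local budget=$1 per_repo=$2
  echo $(( budget / per_repo ))   # integer division rounds down
}

repos_per_hour 5000 60   # expensive repositories: 83 per hour
repos_per_hour 5000 40   # cheap repositories: 125 per hour
```

Batch jobs that approach these numbers should spread work across the hour or across tokens rather than assume the best case.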

GitHub Actions Integration

The ossf/scorecard-action workflow below scores your own repository on each push to main, on pull requests targeting main, and on a weekly schedule. It publishes SARIF output to the GitHub Security tab, where findings appear alongside CodeQL alerts:

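name: Scorecard supply chain security
on:
  push:
    branches: [main]
  schedule:
    - cron: "15 2 * * 1"
  pull_request:
    branches: [main]

# Keep workflow-level permissions read-only. With publish_results: true,
# scorecard-action rejects workflows that grant write access at this level.
permissions: read-all

jobs:
  scorecard:
    name: Scorecard analysis
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # upload SARIF to code scanning
      id-token: write          # needed to publish results
      contents: read
      actions: read
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
        with:
          persist-credentials: false

      - uses: ossf/scorecard-action@f49aabe0b5af0936a0987cfb85d86b75d87d3b5e
        with:
          results_file: scorecard.sarif
          results_format: sarif
          publish_results: true

      # Tag shown for readability; pin to a full commit SHA in real use.
      - uses: actions/upload-artifact@v4
        with:
          name: scorecard-sarif
          path: scorecard.sarif
          retention-days: 5

      # Tag shown for readability; pin to a full commit SHA in real use.
      - uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: scorecard.sarif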

Pin scorecard-action, and every other action in the workflow, to a commit digest rather than a tag. Scorecard itself will flag the workflow otherwise: tag-pinned steps, such as the upload-artifact@v4 and upload-sarif@v3 references above, lower its own Pinned-Dependencies score.

To gate dependency updates in CI — blocking a pull request that introduces a new dependency with a low Scorecard score — add a step that reads from a dependency manifest and invokes Scorecard for each new or changed entry:

- name: Score new dependencies
  env:
    GITHUB_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # Assumes the scorecard CLI is already installed on the runner.
    THRESHOLD=5
    FAILED=0
    while IFS= read -r repo; do
      [[ -z "$repo" || "$repo" == \#* ]] && continue
      score=$(scorecard --repo="${repo}" --format json 2>/dev/null \
               | jq -r '.score // 0' || true)
      # Fail closed: if the scan produced no output, treat the score as 0.
      score=${score:-0}
      if awk -v s="$score" -v t="$THRESHOLD" 'BEGIN{exit !(s < t)}'; then
        echo "::error::Scorecard score ${score}/10 is below threshold for ${repo}"
        FAILED=1
      else
        echo "::notice::Scorecard score ${score}/10 for ${repo}"
      fi
    done < .github/tracked-deps.txt
    exit "$FAILED"

The .github/tracked-deps.txt file contains one github.com/org/repo entry per line. Update it as part of the PR that adds a new dependency.
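Because a malformed entry would be scored as a failed scan, it can help to validate the file format before the gate runs. The sketch below accepts blank lines and # comments, matching what the gate step skips, and rejects anything that is not a github.com/org/repo entry. The validate_tracked_deps name and the permitted character set are assumptions.

```shell
#!/usr/bin/env bash
# Validate that each non-comment line has the form github.com/org/repo.
# Returns 0 when the file is clean, 1 when any line is malformed.
validate_tracked_deps() {
  local file=$1 line n=0 bad=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    n=$((n + 1))
    [[ -z "$line" || "$line" == \#* ]] && continue
    if [[ ! "$line" =~ ^github\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$ ]]; then
      echo "line ${n}: not a github.com/org/repo entry: ${line}" >&2
      bad=1
    fi
  done < "$file"
  return "$bad"
}

# Usage:
#   validate_tracked_deps .github/tracked-deps.txt || exit 1
```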

Interpreting Scores

A composite score of 7 or above indicates a project with meaningful security process maturity: typically branch protection is configured, code review is enforced, dependencies are updated automatically, and releases are likely signed. It does not mean the project has no vulnerabilities; it means the processes that would surface and fix vulnerabilities are in place.

A composite score between 4 and 6 indicates partial adoption. Typically this means some checks pass (CI exists, no dangerous workflows) but critical ones fail (no signed releases, no security policy, dependencies not pinned). These projects are worth using with compensating controls: pin to exact version, enable GitHub watch notifications, check commit history manually before updating.

A composite score below 4 indicates a project with significant process gaps. Multiple high-weight checks are failing. Adopting this dependency means accepting that you will likely not receive advance notice of vulnerability fixes, cannot verify release artifact integrity, and may not detect if the project has been compromised.

When prioritising which failing checks to remediate on your own repositories, address them in this order:

  1. Dangerous-Workflow — directly exploitable attack surface; fix immediately
  2. Binary-Artifacts — committed binaries cannot be audited; remove them
  3. Token-Permissions — over-privileged workflows amplify any compromise
  4. Pinned-Dependencies — mutable CI dependencies are a persistent attack surface
  5. Branch-Protection — without review enforcement, any of the above can regress
  6. Signed-Releases — downstream consumers cannot verify artifacts without this
  7. Security-Policy — without a disclosure channel, vulnerabilities become public immediately
  8. Dependency-Update-Tool — without automation, transitive exposure accumulates silently
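The ordering above can be applied mechanically to a set of check results. The sketch below reads "name score" pairs and prints the failing checks in that priority order, keeping each check's rank from the list. The remediation_order helper and the default failing threshold of 5 are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Print failing checks in the remediation order listed above.
# Input lines: "<check-name> <score>". A check is failing if score < threshold.
remediation_order() {
  local threshold=${1:-5}
  awk -v t="$threshold" '
    BEGIN {
      list = "Dangerous-Workflow Binary-Artifacts Token-Permissions"
      list = list " Pinned-Dependencies Branch-Protection Signed-Releases"
      list = list " Security-Policy Dependency-Update-Tool"
      n = split(list, order, " ")
    }
    { score[$1] = $2 }
    END {
      for (i = 1; i <= n; i++)
        if ((order[i] in score) && score[order[i]] + 0 < t + 0)
          print i ". " order[i] " (score " score[order[i]] ")"
    }
  '
}

# Usage, feeding pairs extracted from the Scorecard JSON output:
#   scorecard --repo=github.com/some-org/some-repo --format json \
#     | jq -r '.checks[] | "\(.name) \(.score)"' | remediation_order 5
```

Checks outside the eight listed are ignored, so the output is a work queue rather than a full report.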

deps.dev API

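Google’s deps.dev aggregates Scorecard results for packages in npm, Go, PyPI, Maven, and Cargo ecosystems, and querying it does not consume GitHub API quota. Scorecard data is attached to the source repository (a "project" in deps.dev terms) rather than to an individual package version, so a lookup takes two requests. The version endpoint returns known advisories and the related source repository:

curl -s "https://api.deps.dev/v3alpha/systems/npm/packages/express/versions/4.18.2" \
  | jq '{
      name:       .versionKey.name,
      version:    .versionKey.version,
      advisories: [.advisoryKeys[]?.id],
      repo:       ([.relatedProjects[]? | select(.relationType == "SOURCE_REPO") | .projectKey.id][0])
    }'

The project endpoint then returns the Scorecard result for that repository:

curl -s "https://api.deps.dev/v3alpha/projects/github.com%2Fexpressjs%2Fexpress" \
  | jq '{
      scorecard: .scorecard.overallScore,
      checks: [.scorecard.checks[]? | {name, score}]
    }'

For Go modules hosted on GitHub, the module path usually matches the repository, so the project endpoint can be queried directly:

curl -s "https://api.deps.dev/v3alpha/projects/github.com%2Fgorilla%2Fmux" \
  | jq '{
      scorecard: .scorecard.overallScore,
      checks: [.scorecard.checks[]? | {name, score}]
    }'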

Retrieve the full dependency graph for a package version — useful for identifying which transitive dependencies to score next:

curl -s "https://api.deps.dev/v3alpha/systems/npm/packages/webpack/versions/5.91.0:dependencies" \
  | jq '[.nodes[] | {name: .versionKey.name, version: .versionKey.version}]'

The deps.dev API returns cached Scorecard results that may be a few days old. For freshness-sensitive decisions (for example, before adopting a new dependency), run the Scorecard CLI directly to get a current score.

Query deps.dev in a loop to build a scored inventory of every package in a lockfile without consuming GitHub API quota:

#!/usr/bin/env bash
set -euo pipefail

# Requires jq, curl, and a package-lock.json (lockfileVersion 2 or 3)
API="https://api.deps.dev/v3alpha"
urlencode() { jq -rn --arg s "$1" '$s|@uri'; }

# Nested installs look like node_modules/a/node_modules/b; keep only "b".
jq -r '.packages | to_entries[]
        | select(.key | startswith("node_modules/"))
        | select(.value.version != null)
        | "\(.key | split("node_modules/") | last) \(.value.version)"' package-lock.json \
  | sort -u \
  | while read -r name version; do
      # Step 1: resolve the package version to its source repository.
      repo=$(curl -sf "${API}/systems/npm/packages/$(urlencode "$name")/versions/${version}" \
               | jq -r '[.relatedProjects[]? | select(.relationType == "SOURCE_REPO") | .projectKey.id][0] // empty' \
               || true)
      # Step 2: fetch the Scorecard result attached to that repository.
      scorecard="N/A"
      if [[ -n "$repo" ]]; then
        scorecard=$(curl -sf "${API}/projects/$(urlencode "$repo")" \
                      | jq -r '.scorecard.overallScore // "N/A"' || echo "N/A")
      fi
      echo "${scorecard}  ${name}@${version}"
    done \
  | sort -n

Using Scorecard as a PR Gate

Blocking a dependency update PR when the target package scores below a threshold requires detecting which packages are being added or changed. Extract the diff of the lockfile and score only changed entries:

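- name: Detect lockfile changes
  id: lockdiff
  run: |
    # Requires origin/main to be available locally (for example, fetch-depth: 0).
    # Resolved URLs look like https://registry.npmjs.org/<name>/-/<file>.tgz;
    # capture everything before "/-/" so hyphenated and scoped names survive.
    git diff origin/main -- package-lock.json \
      | grep '^+.*"resolved"' \
      | grep -oP 'https://registry\.npmjs\.org/\K.+?(?=/-/)' \
      | sort -u > /tmp/new-packages.txt
    cat /tmp/new-packages.txt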

- name: Gate on Scorecard scores
  run: |
    THRESHOLD=5
    FAILED=0
    API="https://api.deps.dev/v3alpha"
    urlencode() { jq -rn --arg s "$1" '$s|@uri'; }
    while IFS= read -r pkg; do
      [[ -z "$pkg" ]] && continue
      version=$(jq -r --arg pkg "$pkg" \
        '.packages["node_modules/\($pkg)"].version // empty' package-lock.json)
      # Resolve the package version to its source repository, then look up
      # the Scorecard result attached to that repository.
      repo=$(curl -sf "${API}/systems/npm/packages/$(urlencode "$pkg")/versions/${version}" \
               | jq -r '[.relatedProjects[]? | select(.relationType == "SOURCE_REPO") | .projectKey.id][0] // empty')
      sc=""
      if [[ -n "$repo" ]]; then
        sc=$(curl -sf "${API}/projects/$(urlencode "$repo")" \
               | jq -r '.scorecard.overallScore // empty')
      fi
      # Fail closed: a missing version, repository, or score counts as 0.
      sc=${sc:-0}
      if awk -v s="$sc" -v t="$THRESHOLD" 'BEGIN{exit !(s < t)}'; then
        echo "::error::${pkg}@${version} Scorecard score ${sc}/10 below threshold ${THRESHOLD}"
        FAILED=1
      fi
    done < /tmp/new-packages.txt
    exit "$FAILED"

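For Renovate-managed repositories, surface the score requirement in the pull request itself. The configuration below uses prBodyDefinitions and prBodyColumns to add a ScorecardScore column to the PR body table. As written, the column holds a static reminder to check deps.dev rather than a computed score, customManagers is left empty, and automerge is disabled so that every dependency update needs an explicit merge: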

{
  "customManagers": [],
  "prBodyDefinitions": {
    "ScorecardScore": "{{#if depName}}{{depName}} score: check deps.dev{{/if}}"
  },
  "prBodyColumns": ["Package", "Change", "ScorecardScore"],
  "automerge": false,
  "automergeType": "pr"
}

Combined with a required status check from the Scorecard gate workflow, this keeps dependency update PRs from merging until the check passes.

OSSF Allstar

Allstar is an OpenSSF GitHub App that enforces Scorecard-based policies continuously across your own organisation’s repositories. Unlike running Scorecard in CI — which gates your own changes — Allstar watches for configuration drift on existing repositories and opens issues or files a fix PR when a policy is violated.

Install Allstar from the GitHub Marketplace and configure it in an .allstar repository at the organisation level:

# .allstar/allstar.yaml
# optOutStrategy: true enables Allstar on every repository unless it opts out.
# With false, only repositories listed under optInRepos are covered.
optConfig:
  optOutStrategy: true

issueLabel: allstar
issueFooter: "This issue was opened by Allstar. See https://github.com/ossf/allstar"

# .allstar/branch_protection.yaml
optConfig:
  optOutStrategy: true

action: issue
enforceDefault: true
blockForce: true
requireApproval: true
approvalCount: 1
dismissStale: true
requireUpToDateBranch: true
enforceOnAdmins: true

# .allstar/dangerous_workflow.yaml
optConfig:
  optOutStrategy: true
action: issue

# .allstar/binary_artifacts.yaml
optConfig:
  optOutStrategy: true
action: issue

When Allstar detects a violation — a repository that has disabled branch protection, or a workflow file that has introduced a dangerous pattern — it opens an issue in that repository and optionally creates a fix PR. For organisations with dozens of repositories, Allstar enforces the floor without requiring each team to manage Scorecard individually.

Building a Dependency Risk Register

A dependency risk register combines three data sources: an SBOM with the full dependency inventory, Scorecard scores for each GitHub-hosted package, and CVE data from OSV or a vulnerability database. See SBOM generation and consumption for the mechanics of generating CycloneDX or SPDX SBOMs from your build system.

The risk register structure for each dependency entry:

{
  "name": "express",
  "version": "4.18.2",
  "ecosystem": "npm",
  "github_repo": "github.com/expressjs/express",
  "scorecard": {
    "composite": 6.2,
    "scored_at": "2026-05-09T00:00:00Z",
    "checks": {
      "Branch-Protection": 7,
      "Signed-Releases": 0,
      "Security-Policy": 9,
      "Pinned-Dependencies": 5,
      "Maintained": 10,
      "Dangerous-Workflow": 10,
      "Dependency-Update-Tool": 10
    }
  },
  "vulnerabilities": [],
  "risk_tier": "medium",
  "accepted_exception": false,
  "owner": "platform-team",
  "review_due": "2026-08-09"
}

Ingest this into Dependency-Track by combining SBOM import with a custom property for the Scorecard score. Dependency-Track’s REST API accepts component-level custom properties:

DT_URL="https://dependency-track.internal"
DT_KEY="your-api-key"
COMPONENT_UUID="$(curl -sf "${DT_URL}/api/v1/component?name=express&version=4.18.2" \
  -H "X-Api-Key: ${DT_KEY}" | jq -r '.[0].uuid')"

curl -sf -X PUT "${DT_URL}/api/v1/component/${COMPONENT_UUID}" \
  -H "X-Api-Key: ${DT_KEY}" \
  -H "Content-Type: application/json" \
  -d "{
    \"uuid\": \"${COMPONENT_UUID}\",
    \"properties\": [
      {\"groupName\": \"scorecard\", \"propertyName\": \"composite\", \"propertyValue\": \"6.2\"},
      {\"groupName\": \"scorecard\", \"propertyName\": \"signed_releases\", \"propertyValue\": \"0\"}
    ]
  }"

Use Dependency-Track’s policy engine to flag components where the scorecard.composite property falls below your threshold, co-located with any CVE findings. This creates a single dashboard where a component with a low Scorecard score and an open CVE — the highest-risk combination — is surfaced without manual correlation.

Risk tiers based on Scorecard composite score and CVE status:

Composite score   Open CVEs   Risk tier   Action
>= 7              None        Low         Standard update cadence
>= 7              Present     Medium      Remediate within SLA
4–6               None        Medium      Enable watch notifications; review on each version bump
4–6               Present     High        Expedited remediation; evaluate alternatives
< 4               None        High        Formal exception required; compensating controls mandatory
< 4               Present     Critical    Block adoption or emergency migration

For packages that have no GitHub repository — available only on PyPI, npm, or Maven Central with no source link — Scorecard cannot run. Assign a synthetic score of 0 for the checks that require repository access and apply the High tier by default until a source repository is confirmed. This is a conservative default that protects against the typosquatting and package registry attacks where packages are published without a corresponding public source repository.
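The tier table and the no-repository default above can be expressed as a small lookup. In this sketch, risk_tier is a hypothetical helper, "none" stands for a package without a source repository, and scores between 6 and 7 are placed in the middle band, which is an assumption because the table leaves that range unspecified.

```shell
#!/usr/bin/env bash
# Map a composite score and CVE status to a risk tier.
#   $1: composite score (0-10), or "none" when no source repository is known
#   $2: "yes" if open CVEs are present, otherwise "no"
risk_tier() {
  local score=$1 has_cves=$2
  # No repository: synthetic score of 0, which lands in the High tier or above.
  [[ "$score" == "none" ]] && score=0
  awk -v s="$score" -v c="$has_cves" 'BEGIN {
    if (s >= 7)      print ((c == "yes") ? "Medium"   : "Low")
    else if (s >= 4) print ((c == "yes") ? "High"     : "Medium")
    else             print ((c == "yes") ? "Critical" : "High")
  }'
}

risk_tier 6.2 no     # Medium
risk_tier none no    # High
```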

Refresh scores on a weekly schedule rather than per-commit. Scorecard results for stable projects change infrequently; scoring every commit wastes API quota. Score again whenever a new version of a dependency is adopted.

Limitations

Scorecard measures process maturity, not code correctness. A project can score 10 out of 10 and still ship a remote code execution vulnerability. Branch protection enforced, signed releases published, CI running on every PR, dependencies pinned — and a SQL injection in the authentication path that nobody tested for.

Conversely, a project can score 3 out of 10 and have an exemplary security track record because its maintainer is an experienced security engineer who reviews every line personally and responds to vulnerability reports within 24 hours without a formal policy document.

The correlation between high Scorecard scores and lower vulnerability density is real at the population level — projects with mature security processes are statistically less likely to have unpatched public vulnerabilities. But at the individual project level, the score is a process indicator, not an outcome guarantee.

What Scorecard cannot detect:

  • Logic vulnerabilities in the project’s source code — the checks do not analyse code semantics
  • Compromised maintainer accounts that have not yet been used to publish a malicious release
  • Supply chain attacks on the project’s own dependencies that have not yet been detected and disclosed
  • Private forks or unreleased patches that fix vulnerabilities not yet assigned a CVE
  • Registry-level attacks such as dependency confusion where the package name is hijacked at the registry layer rather than at the repository layer

Use Scorecard alongside, not instead of: SBOM-driven vulnerability scanning with osv-scanner or Grype, artifact signature verification with cosign, and Software Composition Analysis integrated into CI. Scorecard answers “does this project follow security-oriented processes?” — the other tools answer “does this specific version have known vulnerabilities?” and “is this artifact authentic?”. All three questions are necessary for a complete dependency risk posture.

The Maintained check is the most operationally critical signal that is not covered by traditional SCA tools. A project with no commits in 90 days and an open CVE is a dependency that has already failed — the question is when, not whether, you will need to migrate. Scorecard surfaces this before it becomes an incident.