Artifact Copy Integrity: Closing the Substitution Window in Multi-Stage Build Pipelines

Problem

Most pipelines treat artifact promotion as a logistics problem: get the built artifact from the build stage to where the deploy stage can find it. The security question — whether the artifact that arrives is the same artifact that was built — is almost never asked explicitly, and almost never enforced mechanically.

The typical multi-stage flow looks like this: a build job compiles source, produces an image or binary, pushes it to an intermediate store tagged by branch name or build number, and exits. A downstream test job pulls by that same tag, runs tests, and uploads a pass/fail result. A staging job pulls the same tag, deploys, and if smoke tests pass, promotes to production by re-tagging or re-uploading. A production job pulls the production-tagged artifact and deploys.

Each handoff in this chain shares a common property: the artifact is referenced by a mutable identifier — a tag, a filename, a path in an S3 prefix — and the receiving stage does not verify that what it received matches what the previous stage produced. Between any two stages, there is a window in which someone with write access to the intermediate store can replace the artifact. This is the substitution window.
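The window can be reproduced locally without any real registry or bucket. The sketch below is illustrative only: a temporary directory stands in for the shared intermediate store, and the file name and contents are made up for the demonstration. It assumes GNU coreutils (sha256sum, mktemp).

```shell
# Local simulation of the substitution window.
set -eu

store=$(mktemp -d)
trap 'rm -rf "$store"' EXIT

# Build stage: publish under a mutable name and record the digest produced.
printf 'legitimate build output\n' > "$store/myapp.tar.gz"
built_digest=$(sha256sum "$store/myapp.tar.gz" | cut -d' ' -f1)

# Substitution window: another writer overwrites the same name.
printf 'attacker payload\n' > "$store/myapp.tar.gz"

# Deploy stage: reading by name alone still succeeds.
# Only a comparison against the digest recorded at build time notices the swap.
received_digest=$(sha256sum "$store/myapp.tar.gz" | cut -d' ' -f1)
if [ "$received_digest" = "$built_digest" ]; then
  result="match"
else
  result="mismatch"
fi
echo "deploy-stage check: $result"
```

The read by name never fails; the only signal that something changed comes from the digest the build stage recorded before the overwrite.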

The window is not theoretical. Real-world scenarios where it materialises:

S3 intermediate artifact storage with shared write IAM. A build job uploads build-artifacts/myapp-1.2.3.tar.gz to a shared S3 bucket. The IAM policy that permits the build job to upload also grants write access to other pipelines in the same account — or to an IAM role held by a third-party integration. Between the upload and the deploy job’s download, any principal with s3:PutObject on that prefix can overwrite the object. S3 does not version objects by default. The deploy job downloads and deploys the replacement.

Docker Hub push/pull race on mutable tags. A CI pipeline pushes myorg/myapp:main to Docker Hub, then triggers a downstream job that pulls myorg/myapp:main and scans it. Docker Hub tags are mutable. Another authenticated user with write access to the repository — a compromised machine credential, an overly broad team permission, a co-maintainer — can push a different image to the same tag in the gap between the upstream push and the downstream pull. The scan runs against the replacement.

Artifact registry with broad write IAM in a promotion flow. An artifact is promoted from a dev registry to a staging registry by pulling by tag and re-pushing to staging. Any principal with write access to the staging registry at promotion time can push to the same tag before the deploy job reads it.

npm publish-then-install race. A pipeline publishes an npm package, then a downstream job installs it for integration testing. The public npm registry does not allow a published version to be overwritten, but some private registries do, and dist-tags such as latest can be repointed at any time. In either case the package the downstream job installs can differ from the one the pipeline published.

The substitution window attack requires write access to the intermediate store, not to the build system itself. This is a significantly lower privilege threshold. Attackers who cannot compromise your CI runners, your source repositories, or your signing keys may still hold — or be able to obtain — write access to a shared S3 bucket, an artifact registry, or a Docker Hub organisation.

Threat Model

Compromised pipeline credential swapping artifact in object storage. A long-lived IAM access key or service account key with write access to the artifact bucket is exfiltrated from another pipeline that shares the same credentials. The attacker uses it to overwrite the artifact between build upload and deploy download. The deploy job downloads and deploys malware. Blast radius: full production compromise with no build system footprint.

Registry write access substituting a signed image tag between promotion jobs. An attacker obtains write access to the intermediate registry through a compromised robot account, a leaked Docker Hub personal access token, or a misconfigured IAM role. The build job signed the image with cosign and pushed the signature to the registry. The attacker pushes a malicious image to the same mutable tag. The signature in the registry still refers to the original digest, but the tag now resolves to the replacement. A downstream job that runs cosign verify against the tag is protected only if it then pulls the exact digest that verification resolved. If it verifies by tag and later pulls by tag, the tag is resolved twice, and a swap between the two lookups deploys an image that was never verified. Pipelines that skip verification after the initial sign step will deploy the replacement without any alert.

MITM on artifact download in a CI runner. A CI runner pulls an artifact over HTTP, or over HTTPS with certificate verification disabled (--no-check-certificate, curl -k, requests with verify=False). A network-level attacker — possible in cloud environments with compromised VPCs or managed network appliances — substitutes the artifact in transit. No registry write access needed. Blast radius: any pipeline step that downloads artifacts without transport-layer or content-level verification.

Insider threat swapping artifact between scan pass and deployment. An internal operator with write access to the artifact store — legitimate for their role, such as a storage admin or a platform team member maintaining the CI infrastructure — modifies an artifact after it passes a vulnerability scan and before the deployment job retrieves it. The artifact carries a clean scan record that no longer reflects its actual content. Audit logs record the operator’s access but nothing in the pipeline detects the mismatch.

In every scenario, the common enabling condition is the same: a downstream pipeline stage reads an artifact by mutable reference without independently verifying that the content matches the digest produced upstream.

Configuration and Implementation

Content-Addressed Storage as the Foundation

The root fix is to replace mutable references with content-addressed references throughout the pipeline. A content-addressed reference is a digest: a cryptographic hash of the artifact’s content, typically SHA-256. Two artifacts with the same digest are identical. Any modification — even a single bit — produces a different digest.
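As a small illustration of that property, the sketch below hashes two files that differ in a single bit, plus a byte-for-byte copy of the first. The file names and contents are invented for the example, and GNU coreutils is assumed.

```shell
# Identical content gives identical digests; a one-bit change gives a different one.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'release-1.2.2' > "$workdir/a.bin"
printf 'release-1.2.3' > "$workdir/b.bin"   # last byte 0x32 -> 0x33: one bit flipped
cp "$workdir/a.bin" "$workdir/a-copy.bin"

# Print only the hex digest of a file.
digest_of() { sha256sum "$1" | cut -d' ' -f1; }

digest_a=$(digest_of "$workdir/a.bin")
digest_b=$(digest_of "$workdir/b.bin")
digest_copy=$(digest_of "$workdir/a-copy.bin")

echo "a:      $digest_a"
echo "b:      $digest_b"
echo "a-copy: $digest_copy"
```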

For container images, the digest is the SHA-256 hash of the image manifest:

# Always push by tag initially (registries require a name), but immediately resolve to digest
docker push myregistry.example.com/myapp:main

# Capture the digest immediately after push — before any other job can reference the image
IMAGE_DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' myregistry.example.com/myapp:main)
# IMAGE_DIGEST = myregistry.example.com/myapp@sha256:abc123...

# All downstream references use the digest, never the tag
docker pull "${IMAGE_DIGEST}"

For binary artifacts stored in S3 or GCS, generate and record the digest at build time:

# Build job: generate artifact and its digest
sha256sum myapp-1.2.3.tar.gz > myapp-1.2.3.tar.gz.sha256

# Upload both artifact and digest file
aws s3 cp myapp-1.2.3.tar.gz s3://build-artifacts/myapp-1.2.3.tar.gz
aws s3 cp myapp-1.2.3.tar.gz.sha256 s3://build-artifacts/myapp-1.2.3.tar.gz.sha256

# Deploy job: download and verify before using
aws s3 cp s3://build-artifacts/myapp-1.2.3.tar.gz ./myapp-1.2.3.tar.gz
aws s3 cp s3://build-artifacts/myapp-1.2.3.tar.gz.sha256 ./myapp-1.2.3.tar.gz.sha256
sha256sum --check myapp-1.2.3.tar.gz.sha256

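The .sha256 sidecar file must itself be protected. If the attacker can overwrite both the artifact and its digest file, the check is defeated. Give the digest file its own write permissions, separate from those of the artifact, so that a credential able to replace one cannot replace the other. S3 object versioning with MFA delete preserves the original version, but an overwrite still becomes the current version, so versioning helps only if the deploy job downloads a recorded version ID rather than the latest object. Alternatively, embed the expected digest in the pipeline configuration itself rather than downloading it from the same store.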
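A minimal sketch of the last option, assuming the expected digest reaches the deploy job through pipeline configuration. The function name verify_pinned_sha256 and the variable EXPECTED_SHA256 are hypothetical, not part of any tool.

```shell
# verify_pinned_sha256 FILE EXPECTED_HEX
# Compares the file's SHA-256 with a digest supplied by the pipeline definition,
# not by the artifact store. Returns non-zero on any mismatch or read failure.
verify_pinned_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" | cut -d' ' -f1)
  if [ -z "$actual" ] || [ "$actual" != "$expected" ]; then
    echo "FATAL: digest mismatch for $file: expected $expected, got ${actual:-<none>}" >&2
    return 1
  fi
}
```

A deploy step might then run `verify_pinned_sha256 myapp-1.2.3.tar.gz "$EXPECTED_SHA256" || exit 1` before unpacking anything. Because the expected value never passes through the bucket, write access to the bucket alone is not enough to defeat the check.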

Per-Stage cosign verify-blob

For binary artifacts, cosign can sign and verify arbitrary blobs, not just container images. Sign at build time, verify at every subsequent stage:

# Build job: sign the artifact with a key pair.
# Key-based signing produces no certificate, so only the signature is written.
# cosign 2.x asks for confirmation before uploading to the transparency log; --yes answers it in CI.
cosign sign-blob \
  --yes \
  --key cosign.key \
  --output-signature artifact.sig \
  myapp-1.2.3.tar.gz

# Upload artifact and signature to S3
aws s3 cp myapp-1.2.3.tar.gz s3://build-artifacts/
aws s3 cp artifact.sig s3://build-artifacts/myapp-1.2.3.tar.gz.sig

# Every downstream job (test, staging deploy, production deploy): verify before use
set -e  # stop the job if a download or the verification fails
aws s3 cp s3://build-artifacts/myapp-1.2.3.tar.gz ./myapp-1.2.3.tar.gz
aws s3 cp s3://build-artifacts/myapp-1.2.3.tar.gz.sig ./artifact.sig

# --key and --certificate are alternatives; with a key pair, pass only --key
cosign verify-blob \
  --key cosign.pub \
  --signature artifact.sig \
  myapp-1.2.3.tar.gz

# Reached only if verify-blob exited 0
echo "Artifact verified, proceeding with deployment"

For keyless signing using Sigstore’s Fulcio CA (appropriate for GitHub Actions OIDC flows):

# Build job: keyless sign using OIDC identity
# (in GitHub Actions the job needs "permissions: id-token: write"; --yes skips the interactive prompt)
cosign sign-blob \
  --yes \
  --oidc-issuer https://token.actions.githubusercontent.com \
  --output-certificate artifact.pem \
  --output-signature artifact.sig \
  myapp-1.2.3.tar.gz

# Deploy job: verify against the expected OIDC identity (workflow and ref, not just issuer)
cosign verify-blob \
  --certificate-identity "https://github.com/myorg/myapp/.github/workflows/build.yml@refs/heads/main" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  --signature artifact.sig \
  --certificate artifact.pem \
  myapp-1.2.3.tar.gz

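Specifying --certificate-identity pins the verification to the exact workflow and ref that produced the artifact. On cosign 1.x, omitting it and relying only on --certificate-oidc-issuer allowed any workflow in any repository on GitHub Actions to produce a valid signature, which is not the intended trust boundary. cosign 2.x rejects keyless verification without an identity, but a catch-all --certificate-identity-regexp has the same effect as omitting it.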
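To keep the pinned identity consistent across stages, the string can be assembled in one place. This is a sketch only: expected_identity is a hypothetical helper, and the format is inferred from the identity used in the example above, not from a specification.

```shell
# expected_identity REPO WORKFLOW_FILE REF
# Prints the workflow identity in the form used by the verify-blob example.
expected_identity() {
  local repo=$1 workflow=$2 ref=$3
  printf 'https://github.com/%s/.github/workflows/%s@%s\n' "$repo" "$workflow" "$ref"
}
```

A deploy script could pass the result as `--certificate-identity "$(expected_identity myorg/myapp build.yml refs/heads/main)"`, so that every stage verifies against the same exact value and not a pattern.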

crane digest and crane copy for Content-Addressed Container Promotion

Container image promotion by re-tagging is a substitution window. The correct tool for content-addressed promotion is crane, which operates on digests throughout:

# Resolve the source tag to a digest immediately after the build job pushes it
SOURCE_DIGEST=$(crane digest myregistry.example.com/myapp:main)
# SOURCE_DIGEST = sha256:abc123...

# Copy by digest to staging registry — crane copy uses the digest internally
crane copy \
  "myregistry.example.com/myapp@${SOURCE_DIGEST}" \
  "staging-registry.example.com/myapp:staging"

# Verify the copied image has the expected digest in staging
STAGING_DIGEST=$(crane digest staging-registry.example.com/myapp:staging)

if [ "${SOURCE_DIGEST}" != "${STAGING_DIGEST}" ]; then
  echo "FATAL: digest mismatch after promotion. Expected ${SOURCE_DIGEST}, got ${STAGING_DIGEST}"
  exit 1
fi

echo "Promotion verified: ${STAGING_DIGEST}"

crane copy preserves the digest when the image is copied between registries with compatible manifest formats. The post-copy digest check is a belt-and-suspenders verification that catches cases where the registry rewrites the manifest (e.g., converting between OCI and Docker v2 formats), which changes the digest.

GitHub Actions: Passing Digests Between Jobs

GitHub Actions jobs run in isolated environments. The naive approach — each job pulls the artifact by tag — reintroduces the substitution window on every job boundary. Pass the digest through job outputs instead:

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-digest: ${{ steps.push.outputs.digest }}
    steps:
      - name: Build image
        run: docker build -t myregistry.example.com/myapp:main .

      - name: Push image
        id: push
        run: |
          docker push myregistry.example.com/myapp:main
          DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' myregistry.example.com/myapp:main | cut -d'@' -f2)
          echo "digest=${DIGEST}" >> "$GITHUB_OUTPUT"

      - name: Sign image
        run: |
          cosign sign \
            --key cosign.key \
            "myregistry.example.com/myapp@${{ steps.push.outputs.digest }}"

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Pull by digest (not tag)
        run: |
          docker pull "myregistry.example.com/myapp@${{ needs.build.outputs.image-digest }}"

      - name: Verify signature before test
        run: |
          cosign verify \
            --key cosign.pub \
            "myregistry.example.com/myapp@${{ needs.build.outputs.image-digest }}"

      - name: Run tests
        run: |
          docker run --rm \
            "myregistry.example.com/myapp@${{ needs.build.outputs.image-digest }}" \
            ./run-tests.sh

  deploy-staging:
    needs: [build, test]
    runs-on: ubuntu-latest
    steps:
      - name: Verify signature before staging deploy
        run: |
          cosign verify \
            --key cosign.pub \
            "myregistry.example.com/myapp@${{ needs.build.outputs.image-digest }}"

      - name: Promote to staging registry by digest
        run: |
          crane copy \
            "myregistry.example.com/myapp@${{ needs.build.outputs.image-digest }}" \
            "staging-registry.example.com/myapp:staging"

The digest propagates through needs.<job>.outputs.<name> and is never re-derived by querying the registry. Every job that uses the image references it by the digest captured at push time. The substitution window between jobs is closed.

Tekton: Chaining Digests Through Task Results

In Tekton, Task results carry data between Tasks in a Pipeline. Use them to pass digests rather than allowing downstream tasks to re-derive the artifact reference:

apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: build-and-push
spec:
  results:
    - name: image-digest
      description: SHA256 digest of the built image
  steps:
    - name: build-push
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--destination=myregistry.example.com/myapp:$(context.pipelineRun.name)"
        - "--digest-file=$(results.image-digest.path)"
---
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: verify-and-deploy
spec:
  params:
    - name: image-digest
      description: Digest to verify and deploy
  steps:
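    - name: verify
      image: gcr.io/projectsigstore/cosign:latest
      # The cosign image may not ship a shell, so pass arguments to its entrypoint instead of using script
      args:
        - "verify"
        - "--key"
        - "k8s://cosign-ns/cosign-pub"
        - "myregistry.example.com/myapp@$(params.image-digest)"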

    - name: deploy
      image: bitnami/kubectl:latest
      script: |
        kubectl set image deployment/myapp \
          "myapp=myregistry.example.com/myapp@$(params.image-digest)"
---
apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: build-verify-deploy
spec:
  tasks:
    - name: build
      taskRef:
        name: build-and-push

    - name: deploy
      taskRef:
        name: verify-and-deploy
      params:
        - name: image-digest
          value: "$(tasks.build.results.image-digest)"
      runAfter:
        - build

Kaniko writes the digest of the pushed image to the path specified by --digest-file. Tekton captures this as a Task result and makes it available to downstream tasks via the $(tasks.<task-name>.results.<result-name>) substitution. The deploy task never queries the registry for the current digest of the tag — it uses only the digest produced by the build task.

SLSA Provenance Verification at Deploy Time

SLSA provenance provides a stronger guarantee than digest pinning alone: it links the artifact digest to the source commit, the build configuration, and the build platform identity. Verifying provenance at deploy time detects not just substitution but also artifacts built from unexpected sources.

# Install slsa-verifier
go install github.com/slsa-framework/slsa-verifier/v2/cli/slsa-verifier@latest

# At deploy time, verify the artifact's provenance before deployment
slsa-verifier verify-artifact myapp-1.2.3.tar.gz \
  --provenance-path myapp-1.2.3.intoto.jsonl \
  --source-uri github.com/myorg/myapp \
  --source-tag v1.2.3

# For container images
slsa-verifier verify-image \
  "myregistry.example.com/myapp@sha256:abc123..." \
  --source-uri github.com/myorg/myapp \
  --source-branch main

--source-uri restricts valid provenance to artifacts built from the specified repository. --source-tag or --source-branch restricts to artifacts built from the specified ref. Provenance that does not match — because the artifact was built from a different repository, a different branch, or not built at all by the expected builder — causes slsa-verifier to exit non-zero and the deployment to halt.
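The non-zero exit only halts the deployment if the verifier actually runs. A deploy script that skips verification when the provenance file is absent fails open. The guard below is a minimal sketch of failing closed; require_provenance is a hypothetical helper name.

```shell
# require_provenance FILE
# Treats missing or empty provenance as a verification failure, not a skip.
require_provenance() {
  local provenance=$1
  if [ ! -s "$provenance" ]; then
    echo "FATAL: provenance file missing or empty: $provenance" >&2
    return 1
  fi
}
```

It would typically sit directly in front of the verifier call, for example `require_provenance myapp-1.2.3.intoto.jsonl || exit 1`, followed by the slsa-verifier verify-artifact invocation shown above.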

Immutable Tags and Registry Policy Enforcement

Immutable tags at the registry level eliminate the substitution window for images in registries that enforce them. When a tag is immutable, a push of a different image to the same tag is rejected — the attacker cannot overwrite the artifact even if they hold write credentials.

In AWS ECR:

aws ecr put-image-tag-mutability \
  --repository-name myapp \
  --image-tag-mutability IMMUTABLE \
  --region us-east-1

In Google Artifact Registry, enable immutable tags on the Docker repository:

gcloud artifacts repositories update myapp-repo \
  --location=us-central1 \
  --immutable-tags

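In Harbor, add tag immutability rules under the project's Policy tab. The rules match repository and tag names with glob-style (doublestar) patterns, not regular expressions; a rule matching ** for both repositories and tags makes all tags in the project immutable. The separate Prevent vulnerable images from running setting blocks pulls of images that fail a scan and has no effect on tag mutability.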

Immutable tags do not eliminate the need for digest-based references in pipelines. A tag that was overwritten before immutability was enabled stays pointed at the replacement, an administrator can switch the policy off, and a tag-based reference carries no protection once the artifact is copied to a store without the policy. Digest-based references remain the correct approach; immutable tags are a defence-in-depth control that prevents the registry itself from being used as the substitution vector.

Expected Behaviour

The table below shows the substitutability of each artifact reference type and the verification action required at each pipeline stage.

| Reference type | Example | Substitutable after push? | Required verification |
| --- | --- | --- | --- |
| Mutable tag | myapp:main | Yes: any authorised principal can overwrite | Verify digest or signature at every read |
| Build number tag | myapp:build-1234 | Yes: tag is unique but still mutable | Verify digest or signature at every read |
| Immutable tag (registry-enforced) | myapp:v1.2.3 (ECR immutable) | No: registry rejects overwrites | Verify digest or signature at first read; registry enforces thereafter |
| Content-addressed digest | myapp@sha256:abc123 | No: digest is content-derived | Digest itself is the integrity check; verify signature for authenticity |
| Digest + cosign signature | myapp@sha256:abc123 + signature in registry | No: digest immutable, signature verifiable | cosign verify confirms authenticity; digest confirms integrity |

Verification commands per stage:

| Stage | Artifact type | Verification command |
| --- | --- | --- |
| Build output | Container image | crane digest immediately after push; write to job output |
| Build output | Binary/tarball | sha256sum at build time; cosign sign-blob |
| Test | Container image | cosign verify myregistry/myapp@<digest> before any use |
| Test | Binary/tarball | cosign verify-blob before any use |
| Staging promote | Container image | crane copy by digest; post-copy crane digest comparison |
| Staging deploy | Container image | cosign verify + slsa-verifier verify-image |
| Production deploy | Container image | cosign verify + slsa-verifier verify-image with --source-tag |
| Production deploy | Binary/tarball | cosign verify-blob + slsa-verifier verify-artifact |

Trade-offs

| Approach | Benefit | Cost |
| --- | --- | --- |
| Digest-only references in pipelines | Closes the tag-substitution window completely | Images cannot be referenced by human-readable names in pipeline YAML; digest must be propagated through job outputs or pipeline parameters, adding pipeline complexity |
| Per-stage cosign verification | Every stage independently confirms artifact authenticity; compromised stage N cannot silently pass a modified artifact to stage N+1 | Each verification adds 2–5 seconds to stage startup; keyless verification requires network access to Rekor and Fulcio at every stage; key-based verification requires secret distribution to every stage |
| Immutable tags at registry level | Prevents registry from being used as substitution vector without changing pipeline code | Tags can never be reused for a different image; pipelines that assume mutable tags (e.g., :latest or :main) require refactoring; tag garbage collection becomes more complex because old tags cannot be overwritten |
| Digest passed through job outputs (not re-derived) | Eliminates intra-pipeline substitution window | Job output size limits apply (GitHub Actions: 1 MB per output); digest must be explicitly threaded through every job dependency chain; pipeline YAML becomes more verbose |
| SLSA provenance verification at deploy | Links artifact to source commit and build configuration; detects artifacts built outside expected pipeline | Requires provenance generation at build time (adds ~30s for slsa-github-generator); requires slsa-verifier installed in deploy environment; provenance verification adds 5–15s per deployment |
| S3 SHA-256 sidecar files | Works with any object storage without registry features | Sidecar can be overwritten alongside artifact if IAM is too broad; requires separate protection for the digest file itself; adds two S3 API calls per artifact operation |

Failure Modes

| Failure mode | How it manifests | Detection and remediation |
| --- | --- | --- |
| Digest verification skipped “just this once” | A developer comments out cosign verify to unblock a deployment, leaves it commented out, and the change is merged to the pipeline definition | Pipeline-as-code review required on all CI configuration changes; OPA or Conftest policy that rejects pipeline YAML without a verify step before every deploy stage |
| Immutable tag policy bypassed by registry admin | A registry administrator with sufficient privilege pushes to a tag declared immutable, either by API or by temporarily disabling the policy | Registry audit logging enabled and shipped to SIEM; alert on any PutImageTagMutability API call that sets mutability to MUTABLE; require break-glass approvals for registry admin operations |
| cosign verification uses wrong identity | cosign verify is called with a catch-all --certificate-identity-regexp, or (on cosign 1.x) with --certificate-oidc-issuer but without --certificate-identity; any workflow on the same OIDC provider can produce a valid signature | Require an exact --certificate-identity in all cosign verify invocations; Conftest policy that rejects cosign invocations without identity pinning; test the verification command with a signature from a different repository and confirm it fails |
| Digest mismatch after cross-registry copy silently ignored | crane copy exits 0 but the post-copy digest check is not implemented or its exit code is not checked | Make post-copy digest comparison mandatory and fail the pipeline on mismatch; use set -e in shell scripts so unchecked failures propagate |
| SLSA provenance missing at deploy time | The build job failed to generate or upload provenance; the deploy job skips slsa-verifier when provenance is absent | Treat missing provenance as a verification failure, not a skip; require provenance file upload as a gated step in the build job before the build job can exit 0 |
| Job output digest truncated or corrupted | A long digest is truncated by job output size limits or character encoding issues; the downstream job pulls by a malformed digest and falls back to a tag | Validate digest format (sha256:[a-f0-9]{64}) at both write time (build job) and read time (deploy job) before use; fail the job if the format is invalid |
| SHA-256 sidecar overwritten by attacker | An attacker with s3:PutObject overwrites both the artifact and its .sha256 sidecar; the download job verifies successfully against the replacement digest | Store expected digest in the pipeline configuration itself or in a write-protected location separate from the artifact store; enable S3 Object Lock on the digest sidecar prefix |
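The digest format check recommended for truncated or corrupted job outputs can be sketched as a small POSIX shell function. The name is_valid_digest is hypothetical; the accepted pattern is the sha256:[a-f0-9]{64} form given in the table.

```shell
# is_valid_digest VALUE
# Succeeds only for "sha256:" followed by exactly 64 lowercase hex characters.
is_valid_digest() {
  case $1 in
    sha256:*) ;;
    *) return 1 ;;
  esac
  hex=${1#sha256:}
  [ "${#hex}" -eq 64 ] || return 1
  case $hex in
    *[!a-f0-9]*) return 1 ;;   # rejects any other character, including newlines
  esac
}
```

A build job might call `is_valid_digest "$DIGEST" || exit 1` before writing the value to its outputs, and each downstream job might repeat the check before using it, so that a malformed value stops the job and cannot trigger a fallback to a tag.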