API Key Lifecycle at Scale: Issuance, Rotation, Scoping, and Audit Across Cloud and SaaS

Problem

API keys leak. The 2024 GitGuardian “State of Secrets Sprawl” report found 23+ million secrets exposed in public GitHub commits over the year — a substantial fraction of those API keys for production systems. The 2025 figure was higher.

Most API-key-leak incidents share a shape: a key was issued for a one-off purpose, never rotated, never reviewed, eventually committed to a public repo or copied to an attacker via some other vector. The leak isn’t surprising; the cleanup is hard because nobody knows what the key was for or what to revoke.

By 2026 mature programs treat API keys as tracked, lifecycle-managed objects:

Issued with explicit purpose, owner, and scope.
Used with auditable per-call attribution.
Rotated on schedule and on event (employee departure, suspected compromise).
Revoked automatically at end-of-life or on detection.
Discoverable — every key has a system-of-record entry showing where it’s used.

The scale problem: a typical org has 10,000-100,000 API keys across cloud providers (AWS, GCP, Azure), SaaS (Stripe, Twilio, Datadog, GitHub, OpenAI), internal services, and legacy systems. Per-key manual lifecycle is impossible.

The specific gaps in unmanaged programs:

New keys minted via UI clicks; no record of purpose.
Keys live in .env files, committed to repos, screenshotted in Slack.
Rotation never happens because “this key is in production, can’t risk breaking it.”
Departed employees’ keys persist for months / years.
No audit trail of which key did what — provider logs show “key X did Y” but X has no description.
Scope often defaults to admin / cluster-wide / unrestricted.

This article covers the lifecycle stages, secret-scanning prevention, programmatic issuance with a system-of-record, automated rotation, and the operational integration with employee onboarding / offboarding.

Target systems: Vault, AWS IAM Identity Center, GCP IAM, Azure Entra, GitGuardian / TruffleHog for scanning, Akeyless / Doppler / Infisical / 1Password for secret management.

Threat Model

Adversary 1 — Public-repo leak: an engineer commits an API key to a public repo; an attacker scans GitHub and uses it.
Adversary 2 — Insider threat: an employee with key access uses it for unauthorized purpose; key has no per-employee attribution.
Adversary 3 — Endpoint compromise: malware exfiltrates .env files or developer credentials.
Adversary 4 — SaaS provider compromise: a SaaS provider is breached; their copy of your API key for them is exposed.
Adversary 5 — Departed employee: former employee retains keys they personally created; uses them after leaving.
Access level: Adversaries 1, 3, 5 have laptop / repo access. Adversaries 2, 4 are insider / vendor-side.
Objective: Use the API key to impersonate the legitimate owner; read or modify data; pivot to other systems.
Blast radius: an unmanaged key with broad scope = full access to the resource indefinitely. With lifecycle management: bounded scope, bounded lifetime, attribution per use.

Configuration

Step 1: System of Record

Every key in your environment has an entry in a central registry.

# api-key-registry/keys/payments-stripe-live.yaml
key_id: payments-stripe-live-001
description: "Stripe live API key for payments service"
owner_team: payments
owner_individual: alice@example.com
created_at: 2026-04-01T10:00:00Z
expiration: 2027-04-01T10:00:00Z
last_rotated: 2026-04-01T10:00:00Z
scope:
  provider: stripe
  account_id: acct_xxx
  permissions: ["charges:write", "customers:read"]
  webhook_endpoints: ["https://api.example.com/stripe-webhook"]
storage:
  vault_path: "secret/payments/stripe-live"
  consumed_by:
    - kubernetes_secret: "payments-stripe-credentials"
      namespace: payments
audit:
  rotation_owner: payments-team
  rotation_frequency_days: 365
  last_audit: 2026-04-01

Every API key has a YAML in this registry. CI validates that:

Every key has an owner (team + individual).
Every key has an expiration < 1 year.
Every key has a documented purpose and scope.
The Vault path actually exists.

Without a registry entry, the key shouldn’t exist. Onboarding a new key without registry creation is a policy violation.

Step 2: Programmatic Issuance

When teams need a new API key, the workflow goes through automation, not the SaaS UI:

# issue_key.py
import argparse, yaml, requests, datetime

def issue_stripe_key(team, purpose, scope):
    # Generate key in Stripe via API.
    response = requests.post(
        f"https://api.stripe.com/v1/api_keys",
        auth=(STRIPE_PROVISIONING_KEY, ""),
        data={
            "name": f"{team}-{purpose}-{datetime.date.today()}",
            "role": scope,   # restricted role
        },
    )
    new_key = response.json()
    key_id = new_key["id"]
    secret = new_key["secret"]

    # Write to Vault.
    vault_path = f"secret/{team}/stripe-{purpose}"
    requests.post(
        f"{VAULT_ADDR}/v1/{vault_path}",
        headers={"X-Vault-Token": VAULT_TOKEN},
        json={"data": {"key": secret, "key_id": key_id}},
    )

    # Register in system-of-record.
    registry_entry = {
        "key_id": key_id,
        "description": purpose,
        "owner_team": team,
        "created_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "expiration": (datetime.datetime.now() + datetime.timedelta(days=365)).isoformat(),
        "scope": scope,
        "storage": {"vault_path": vault_path},
    }
    register_in_git(registry_entry)

if __name__ == "__main__":
    # CLI for engineers to request a new key.
    ...

The flow: engineer runs ./issue_key.py --team payments --purpose webhooks --scope charges:write. Issuance, storage, and registry update happen atomically. No clicking in the Stripe UI.

Step 3: Rotation Automation

Every key rotates on schedule. A controller monitors the registry:

# rotation_controller.py
import datetime, yaml, glob

def find_due_for_rotation():
    due = []
    for path in glob.glob("api-key-registry/keys/*.yaml"):
        entry = yaml.safe_load(open(path))
        last_rot = datetime.datetime.fromisoformat(entry["last_rotated"])
        freq = entry["audit"]["rotation_frequency_days"]
        if (datetime.datetime.now() - last_rot).days >= freq:
            due.append(entry)
    return due

def rotate_key(entry):
    # Mint new key.
    new_secret = provision_new_key(entry)
    # Write to Vault.
    write_to_vault(entry["storage"]["vault_path"], new_secret)
    # Wait for consumers to pick up (via External Secrets Operator).
    wait_for_consumers(entry, timeout_minutes=15)
    # Verify new key works (via a synthetic test).
    if not synthetic_test_passes(new_secret, entry):
        raise RotationError("Synthetic test failed; aborting rotation")
    # Revoke old key.
    revoke_old_key(entry["key_id"])
    # Update registry.
    entry["last_rotated"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
    update_registry(entry)

# Run nightly via cron.
for entry in find_due_for_rotation():
    rotate_key(entry)

The flow handles dual-write windows: new key written, consumers refresh (via External Secrets Operator + reload), old key revoked once consumers are confirmed using the new.

For SaaS providers without an issuance API, fall back to manual rotation with structured tickets:

# rotation-ticket-template.md (Jira / Linear / GitHub Issue).
title: "Rotate {key_id} (due {due_date})"
body: |
  Key {key_id} is due for rotation by {due_date}.
  Owner: {owner_individual} / {owner_team}
  Provider: {provider}
  Vault path: {vault_path}

  Steps:
  - [ ] Mint new key in {provider} UI
  - [ ] Update Vault path
  - [ ] Confirm consumers using new key (synthetic test)
  - [ ] Revoke old key in {provider} UI
  - [ ] Update registry's last_rotated timestamp

Manual rotation tickets are tracked in your normal ticketing system; SLA enforced by engineering management.

Step 4: Pre-Commit Secret Scanning

Prevent leaks at commit time.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.20.0
    hooks:
      - id: gitleaks
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
      - id: detect-secrets

gitleaks and detect-secrets catch most patterns. For your org’s specific key formats, write custom regex:

# .gitleaks.toml
[allowlist]
description = "Allowlist for known test fixtures"
paths = [
  "tests/fixtures/.*",
]

[[rules]]
id = "myorg-api-key"
description = "MyOrg internal API key"
regex = '''myorg_(?:live|test)_[a-z0-9]{32}'''
keywords = ["myorg_"]

CI also runs the scanner; PRs containing secrets are blocked.

Step 5: Secret Scanning Across Existing Repos

For repos that may already contain leaked secrets, run a one-time and ongoing scan:

# One-time scan of historical commits.
gitleaks dir --source=/path/to/repo --report-path=/tmp/secrets.json

# Ongoing: GitHub's secret scanning (free for public repos; commercial Advanced Security for private).
# Or use TruffleHog OSS:
trufflehog git --repo=https://github.com/myorg/payments --json

For every finding: rotate, attribute, audit. Push notifications integrate into your ticketing.

Step 6: Per-Use Attribution Where Possible

Some providers attribute API calls to a specific key in their audit logs. Where possible, every team / service has its own key — never shared.

# Bad: one OpenAI key for the whole org.
key_id: openai-shared
owner_team: shared
consumed_by: [payments, recommendations, search, chat]   # all teams

# Good: per-team keys.
key_id: openai-payments
owner_team: payments
consumed_by: [payments]
key_id: openai-recommendations
owner_team: recommendations
consumed_by: [recommendations]

When OpenAI’s audit shows “key abc123 made a request,” you immediately know it’s the payments team’s key. Cross-team attribution requires per-team keys.

Step 7: Offboarding Flow

When an employee leaves, their personal-issued keys must be enumerated and rotated. Hook into your IDP:

# onboarding_offboarding.py
def employee_offboarded(user_email):
    # Find all keys in the registry where this employee was the individual owner.
    keys = registry.find(owner_individual=user_email)
    for k in keys:
        # Reassign to team, then rotate.
        k["owner_individual"] = k["owner_team"] + "-shared"
        rotate_key(k)   # generates new credential the departing employee never sees
    # Also revoke any session tokens the user had with internal SaaS.
    for sess in identity_provider.list_sessions(user_email):
        identity_provider.revoke(sess)

Triggered automatically when HR’s offboarding signal reaches the IDP.

Step 8: Telemetry and Audit

api_key_total{provider, owner_team}
api_key_age_days{key_id}
api_key_due_for_rotation_total
api_key_rotation_success_total
api_key_rotation_failure_total{reason}
api_key_secret_scanner_findings_total{repo, scanner}
api_key_unauthorized_use_detected_total{key_id}

Alert on:

api_key_age_days{...} > expiration — overdue rotation; team escalation.
api_key_secret_scanner_findings_total non-zero — leaks pending response.
api_key_unauthorized_use_detected_total non-zero — stolen-key incident.

Step 9: Enrichment for Provider Audit

Cross-correlate provider audit logs with your registry:

-- Stripe audit log shows key acct_xxx_yyy made an API call.
-- Look up registry to find who owns that key.
SELECT
    audit.timestamp,
    audit.key_id,
    audit.api_called,
    audit.client_ip,
    registry.owner_team,
    registry.owner_individual,
    registry.purpose
FROM stripe_audit AS audit
JOIN api_key_registry AS registry ON audit.key_id = registry.key_id
WHERE audit.timestamp > now() - interval '7 days'
ORDER BY audit.timestamp DESC;

A provider audit entry without a corresponding registry row is unusual — investigate. An entry whose source IP doesn’t match the expected service is unusual — investigate.

Step 10: Quarterly Audit

Run quarterly:

All keys have non-stale registry entries (last_audit < 90 days).
All keys have rotation within their declared frequency.
All keys have a current owner (team and individual exist in HR).
Any keys without observable use in 30+ days — candidate for revocation.
Any keys with broad scope (admin, cluster-wide) — review necessity.

Expected Behaviour

Signal	Without lifecycle management	With
Time to detect key leak	Often months	Within hours (secret-scanning + audit-log anomaly)
Time to rotate after employee leaves	Indefinite	Hours (offboarding flow)
Number of unaccounted-for keys	Unknown; likely many	Approximately zero
Per-key attribution	Often “shared org” key	Per-team / per-service
Rotation frequency	“When something breaks”	Annual (or shorter) on schedule
Audit trail	Provider-side only	Provider + registry + per-use

Trade-offs

Aspect	Benefit	Cost	Mitigation
Programmatic issuance	Tracked from creation	Engineering effort to build	One-time investment; reuse for many providers.
Per-team keys	Attribution + isolation	Many keys to manage	Lifecycle automation handles bulk; cost is registry storage.
Auto-rotation	No stale credentials	Some provider limitations on automation	Hybrid: automate where APIs allow; ticket-driven for the rest.
Pre-commit scanning	Catches at source	False positives	Tune patterns; allowlist test fixtures.
Quarterly audit	Catches drift	Engineering time	Automate the report; review takes 1-2 hours.
Offboarding integration	Departed employees’ keys revoked	Identity-provider integration	Standard with IDP webhooks; same flow as user-account offboarding.

Failure Modes

Failure	Symptom	Detection	Recovery
Rotation breaks production	App can’t authenticate after rotation	Synthetic monitor or app errors	Automated rotation should detect via synthetic; revert to old key if verification fails. Fix and re-rotate.
Registry drift	Keys exist in providers but not in registry	Quarterly audit reconciliation	Add to registry; investigate why created out-of-band.
Secret scanner false negative	Leaked key not detected	External notification (GitHub, Have I Been Pwned)	Treat as incident; rotate immediately; investigate detection gap.
Provider rate-limits issuance	Bulk rotation fails	Provider returns 429	Spread rotation across hours; respect rate limits.
Offboarding integration miss	Departed employee’s key persists	Periodic IDP-vs-registry comparison	Manual cleanup; fix integration.
Synthetic test gives false-pass	Rotation completes but new key doesn’t actually work	Apps fail despite rotation	Improve synthetic to test the actual production code path.