Internal Developer Platform Security: Securing the Self-Service Infrastructure Layer

Why the IDP Is a High-Value Target

An Internal Developer Platform aggregates the keys to the kingdom. Backstage connects to your version control system, secret store, cloud provider APIs, CI/CD pipelines, and Kubernetes clusters — typically using credentials that span environments. The scaffolder provisions cloud resources. The catalog aggregates deployment targets, secret references, and ownership metadata. TechDocs renders content from source repositories. Plugins communicate with downstream services using service accounts with significant permissions.

The threat model is not abstract. A compromised IDP can:

  • Provision cloud resources in production using the scaffolder’s cloud credentials
  • Extract secrets from Vault, AWS SSM, or Kubernetes secrets surfaced through catalog metadata or plugin config
  • Enumerate the entire internal service estate including security-sensitive services
  • Inject malicious scaffolder templates that execute during template selection, not just during scaffolding
  • Pivot from developer access to platform-engineer access if RBAC is absent or coarse-grained

The blast radius of IDP compromise is bounded by the permissions granted to IDP service accounts and the segregation between IDP components. Most default Backstage deployments fail on both counts. The IDP is not just another internal application — it is an orchestration plane that sits above your infrastructure and inherits the aggregate permissions of everything it provisions.

IDP Threat Model

Actors and Capabilities

A realistic IDP threat model has four actor categories with distinct capability sets:

Unauthenticated external attacker. Relevant if the Backstage instance is internet-accessible, if guest mode is enabled, or if a plugin exposes an unauthenticated endpoint. The attack surface includes the Backstage backend API (/api/*), the scaffolder REST API, the TechDocs rendering pipeline, and any plugin endpoints that skip auth middleware.

Authenticated developer with default access. In a default Backstage deployment with no RBAC policy, an authenticated user can invoke the scaffolder with any registered template, read all catalog entities, and call plugin APIs. Overly permissive scaffolder templates allow a developer to provision cloud resources in environments they would not otherwise have access to.

Compromised plugin or dependency. Backstage plugins run in the same Node.js process as the backend. A malicious or compromised plugin — including one installed via the npm supply chain — has access to all environment variables, the file system, and in-process secrets. Plugin supply chain compromise is the highest-probability path to full IDP credential exfiltration.

Template author with scaffolder access. Scaffolder templates are YAML documents executed with the permissions of the scaffolder service account. A developer with write access to a repository that hosts a registered template can modify that template to run arbitrary registered actions within the scaffolder’s permission boundaries.

Blast Radius

Map your scaffolder service account’s cloud permissions to understand the provisioning blast radius. If the scaffolder uses an AWS IAM role with ec2:RunInstances, iam:CreateRole, or eks:CreateCluster, a malicious template can provision attacker-controlled infrastructure. If the scaffolder has write access to all Kubernetes namespaces, it can create privileged workloads.

The catalog read permission determines the information disclosure blast radius. The Backstage catalog typically contains service owner data, cloud account IDs, Vault secret paths, and dependency graphs. Without entity-level access control, every authenticated user reads all of it.
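
The provisioning side of this mapping can be partly automated. The following is a minimal sketch, assuming the scaffolder role's policy has already been exported as JSON; the PolicyDocument shape and the findHighRiskActions helper are hypothetical names introduced for illustration, and the check deliberately ignores Deny statements, NotAction, resources, and conditions.

```typescript
// Hypothetical, simplified shape of an IAM policy document (only the fields used here).
interface PolicyStatement {
  Effect: 'Allow' | 'Deny';
  Action: string | string[];
}
interface PolicyDocument {
  Statement: PolicyStatement[];
}

// The provisioning actions named in the text above.
const HIGH_RISK_ACTIONS = ['ec2:RunInstances', 'iam:CreateRole', 'eks:CreateCluster'];

// Turn an IAM action pattern such as "iam:*" into a case-insensitive RegExp.
function actionPatternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*')
    .replace(/\?/g, '.');
  return new RegExp(`^${escaped}$`, 'i');
}

// Return the high-risk actions that at least one Allow statement grants.
function findHighRiskActions(policy: PolicyDocument): string[] {
  const hits = new Set<string>();
  for (const statement of policy.Statement) {
    if (statement.Effect !== 'Allow') continue;
    const patterns = Array.isArray(statement.Action) ? statement.Action : [statement.Action];
    for (const pattern of patterns) {
      const matcher = actionPatternToRegExp(pattern);
      for (const action of HIGH_RISK_ACTIONS) {
        if (matcher.test(action)) hits.add(action);
      }
    }
  }
  return Array.from(hits).sort();
}
```

A non-empty result is a prompt for review rather than a verdict: a wildcard such as iam:* is flagged even when a permission boundary would block it in practice.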

Backstage Architecture and Attack Surfaces

Component Breakdown

Catalog ingests catalog-info.yaml files from source repositories and exposes a queryable API over the resulting entity graph. Attack surface: YAML parsing, catalog API endpoints, and the polling/webhook mechanism used to detect repository changes.

Scaffolder executes templates against registered actions. Templates are YAML documents describing parameter collection and action sequences. Built-in actions cover filesystem, git, HTTP, and Kubernetes resource creation. Custom actions run in the same Node.js process. Attack surface: template YAML parsing, action execution, and credentials used by individual actions.

TechDocs fetches documentation from git repositories and builds static sites using MkDocs. The build executes in a subprocess from configuration fetched from the source repository. Attack surface: the mkdocs.yml file, any MkDocs plugins it specifies, and the subprocess execution environment.

Plugins are npm packages installed into the Backstage frontend and backend. Backend plugins run in the same Node.js process as the backend, with access to all secrets in app-config.yaml.

The Backend-Frontend Trust Boundary

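Frontend-to-backend communication relies on tokens issued by the auth provider. How routes are protected depends on the backend system in use. In the legacy backend system, plugins that add routes must explicitly opt into auth middleware, and nothing at the framework level requires authentication on all routes. The new backend system protects plugin routes by default, but a plugin can still exempt individual paths, for example through httpRouter.addAuthPolicy with allow: 'unauthenticated'. In both cases, reviewing each backend plugin for unauthenticated route exposure is a required hardening step.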

Backstage Authentication

OIDC Configuration

Configure Backstage with a real identity provider (OIDC or OAuth) rather than the built-in development auth. For production deployments with GitHub authentication, which uses OAuth:

auth:
  environment: production
  providers:
    github:
      production:
        clientId: ${GITHUB_OAUTH_CLIENT_ID}
        clientSecret: ${GITHUB_OAUTH_CLIENT_SECRET}
        signIn:
          resolvers:
            - resolver: emailMatchingUserEntityProfileEmail

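The signIn.resolvers configuration controls how an authenticated identity maps to a Backstage user entity. With emailMatchingUserEntityProfileEmail, sign-in succeeds only when the email returned by the provider matches the profile email of a User entity in the catalog, so an attacker who authenticates via OAuth but lacks a corresponding catalog entity is rejected. usernameMatchingUserEntityName applies the same requirement to the GitHub username and the entity name. The configurations to avoid are those that issue an identity without a catalog match: custom resolvers that mint tokens for any authenticated account, or the dangerouslyAllowSignInWithoutUserInCatalog option where your Backstage version offers it. Either one admits any GitHub user who can complete the OAuth flow.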
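
The matching rule described above can be sketched in a few lines. This is not Backstage's resolver implementation, only an illustration of the reject-unless-matched behaviour; UserEntity and resolveSignIn are hypothetical names, and the default namespace is an assumption.

```typescript
// Hypothetical, minimal view of a catalog User entity.
interface UserEntity {
  name: string;
  email?: string;
}

// Map a provider-verified email to exactly one catalog user, or refuse sign-in.
function resolveSignIn(email: string | undefined, users: UserEntity[]): string {
  if (!email) {
    throw new Error('identity provider returned no email');
  }
  const wanted = email.trim().toLowerCase();
  const matches = users.filter(user => user.email?.trim().toLowerCase() === wanted);
  const [match] = matches;
  // Zero matches means the caller is not in the catalog; several means the
  // catalog is ambiguous. Both cases are rejected rather than guessed at.
  if (!match || matches.length !== 1) {
    throw new Error('no unique catalog User entity for this identity');
  }
  return `user:default/${match.name}`;
}
```

The important property is the failure mode: an authenticated identity with no catalog counterpart results in an error, never in a freshly minted user.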

Restrict the GitHub OAuth app’s callback URL to your internal IDP hostname and ensure the OAuth app is owned by your organization.

Disabling Guest Mode

Guest mode allows unauthenticated access to Backstage. It is enabled by default in the development configuration and must be explicitly disabled in production:

auth:
  environment: production
  providers: {}

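Ensure no provider named guest appears in app-config.production.yaml. The empty providers map above only illustrates the absence of guest; a real production file lists your actual identity provider there. Guest mode grants unauthenticated read access to the entire catalog, all TechDocs, and any plugin endpoints that rely on the framework’s default auth check.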

API Key Management for Service-to-Service Auth

Backend plugins and external integrations authenticate to Backstage using backend tokens. Configure static backend tokens for service-to-service calls with rotation in mind:

backend:
  auth:
    keys:
      - secret: ${BACKSTAGE_BACKEND_SECRET}

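Rotate BACKSTAGE_BACKEND_SECRET on a schedule using your secret manager. Never embed static secrets in committed app-config.yaml files. Use environment variable substitution (${VAR}) and inject values at runtime from Vault, AWS SSM, or Kubernetes secrets. Note that backend.auth.keys is the legacy key format; on the new backend system, external callers are typically configured under backend.auth.externalAccess instead, so check which form your Backstage version expects.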
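
Rotation is easier when the receiving side can accept the old and the new secret during a short overlap window. The sketch below shows that idea for a generic service that checks a static token; it is not how Backstage validates its own backend tokens, and isValidServiceToken is a hypothetical helper.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hash to a fixed length so timingSafeEqual can compare values of any size.
function sha256(value: string): Buffer {
  return createHash('sha256').update(value, 'utf8').digest();
}

// Accept a token if it matches any currently active secret. During rotation
// the list holds both the outgoing and the incoming secret.
function isValidServiceToken(presented: string, activeSecrets: string[]): boolean {
  const presentedDigest = sha256(presented);
  let valid = false;
  for (const secret of activeSecrets) {
    if (secret.length === 0) continue; // never treat an unset secret as valid
    if (timingSafeEqual(presentedDigest, sha256(secret))) {
      valid = true; // keep looping so timing does not reveal which slot matched
    }
  }
  return valid;
}
```

Once every caller has picked up the new value, the old secret is dropped from the list and the overlap window closes.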

Catalog Entity Integrity

YAML Injection in catalog-info.yaml

The catalog ingests YAML from source repositories. A developer who can modify catalog-info.yaml in a registered repository can inject arbitrary YAML structures. The catalog parser validates against JSON Schema for known entity types, but validation gaps in custom types or plugin extensions can accept unexpected fields.

The more significant risk is Location entities, which point to additional catalog files:

apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  name: shared-catalog
spec:
  targets:
    - ./other-catalog-info.yaml
    - https://internal.example.com/shared-catalog.yaml

An attacker who can create or modify a Location entity can point the ingestion pipeline at an attacker-controlled HTTPS endpoint. The catalog will fetch and ingest whatever YAML is served. Restrict allowed Location targets in the catalog processor configuration:

catalog:
  locations: []
  rules:
    - allow: [Component, API, Group, User, Resource, System, Domain]
  providers:
    github:
      myOrg:
        organization: my-org
        catalogPath: /catalog-info.yaml
        filters:
          branch: main
  processingInterval: { minutes: 10 }

Configure catalog.rules to allow only entity kinds you actually use. Combine this with a GitHub catalog provider that ingests only from your organization’s repositories on the main branch, rather than accepting manually registered Location URLs from arbitrary sources.
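
Where manual registration cannot be removed entirely, the same restriction can be checked before a target is accepted, for instance in a pre-merge hook. The following is a minimal sketch under stated assumptions: the host and organization values are placeholders taken from the example above, isAllowedLocationTarget is a hypothetical helper, and relative targets are rejected outright for simplicity.

```typescript
// Assumed allowlist: a single SCM host and a single organization.
const ALLOWED_HOSTS = new Set(['github.com']);
const ALLOWED_ORG = 'my-org';

// Accept only absolute HTTPS URLs that point into the allowed organization.
function isAllowedLocationTarget(target: string): boolean {
  let url: URL;
  try {
    url = new URL(target);
  } catch {
    return false; // not an absolute URL (relative paths are not handled here)
  }
  if (url.protocol !== 'https:') return false;
  if (!ALLOWED_HOSTS.has(url.hostname)) return false;
  // The first path segment on the SCM host is the organization or owner.
  const [org] = url.pathname.split('/').filter(Boolean);
  return org === ALLOWED_ORG;
}
```

Comparing the parsed hostname rather than a string prefix matters: a prefix check would accept a host such as github.com.evil.example.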

Validating Entity Definitions

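Validate catalog-info.yaml in CI before merge. The command below assumes your @backstage/cli version provides a catalog:validate subcommand; confirm this with npx @backstage/cli --help, and if it is absent, substitute a dedicated entity validator such as @roadiehq/backstage-entity-validator: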

npx @backstage/cli catalog:validate --path ./catalog-info.yaml

This catches malformed entities before they reach the catalog processor.
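
To make concrete what such a pre-merge check looks for, here is a minimal structural validator. It is a sketch, not the Backstage validator: it assumes the YAML has already been parsed into a plain object, the allowed kinds mirror the catalog.rules example above, and the name pattern is a simplified version of the catalog's naming rule.

```typescript
// Kinds permitted by the catalog.rules example above (an assumption for this sketch).
const ALLOWED_KINDS = new Set(['Component', 'API', 'Group', 'User', 'Resource', 'System', 'Domain']);
// Simplified: alphanumeric at both ends, with -, _ and . allowed in between.
const NAME_PATTERN = /^[a-zA-Z0-9]([a-zA-Z0-9_.-]*[a-zA-Z0-9])?$/;

// Return a list of problems; an empty list means the entity passed these checks.
function validateEntity(entity: unknown): string[] {
  if (typeof entity !== 'object' || entity === null || Array.isArray(entity)) {
    return ['entity must be a YAML mapping'];
  }
  const doc = entity as Record<string, unknown>;
  const errors: string[] = [];

  const apiVersion = doc['apiVersion'];
  if (typeof apiVersion !== 'string' || !apiVersion.startsWith('backstage.io/')) {
    errors.push('apiVersion must start with backstage.io/');
  }

  const kind = doc['kind'];
  if (typeof kind !== 'string' || !ALLOWED_KINDS.has(kind)) {
    errors.push(`kind ${JSON.stringify(kind)} is not allowed`);
  }

  const metadata = doc['metadata'];
  const name =
    typeof metadata === 'object' && metadata !== null
      ? (metadata as Record<string, unknown>)['name']
      : undefined;
  if (typeof name !== 'string' || name.length > 63 || !NAME_PATTERN.test(name)) {
    errors.push('metadata.name is missing or malformed');
  }
  return errors;
}
```

Rejecting disallowed kinds at this stage means a Location entity smuggled into a repository fails CI instead of reaching the catalog processor.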

Scaffolder Security

Template Injection Risks

Scaffolder templates use the Nunjucks templating engine for parameter interpolation. If a custom action passes user-supplied values to a shell command or constructs file paths from template variables without sanitization, template injection becomes code execution within the scaffolder process.

Consider a custom action that passes a user-supplied parameter directly to a shell command:

async handler(ctx) {
  // Vulnerable: user input is interpolated into a shell command line.
  await exec(`deploy.sh --env ${ctx.input.environment}`);
}

If environment is not validated against an allowlist, a user can supply production; rm -rf /workspace and execute arbitrary commands. Use parameterized subprocess calls (execFile with argument arrays) rather than string interpolation into shell commands, and validate all user-supplied values before use.
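
A safer version of the same step might look like the sketch below. It assumes the same deploy.sh script and --env flag as the example above; parseEnvironment, buildDeployArgs, and runDeploy are hypothetical helper names, and the allowlist values are placeholders.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Assumed allowlist of deployable environments.
const ALLOWED_ENVIRONMENTS = ['development', 'staging'] as const;
type Environment = (typeof ALLOWED_ENVIRONMENTS)[number];

// Reject anything that is not exactly one of the allowed values.
function parseEnvironment(value: unknown): Environment {
  const match = ALLOWED_ENVIRONMENTS.find(env => env === value);
  if (!match) {
    throw new Error(`unsupported environment: ${JSON.stringify(value)}`);
  }
  return match;
}

// Build the argument vector; each element reaches the script as one argv entry.
function buildDeployArgs(value: unknown): string[] {
  return ['--env', parseEnvironment(value)];
}

// execFile does not spawn a shell, so metacharacters such as ";" are never interpreted.
async function runDeploy(value: unknown): Promise<void> {
  await execFileAsync('deploy.sh', buildDeployArgs(value));
}
```

The two defences are independent on purpose: the allowlist stops unexpected values, and the argument array ensures that even a value that slipped through would be passed as data rather than parsed as a command.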

For parameters selecting from a fixed set of values, enforce the constraint at the JSON Schema level in the template:

parameters:
  - title: Deployment Target
    properties:
      environment:
        type: string
        enum:
          - development
          - staging

Limiting Scaffolder Permissions

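Map every registered action to the permissions it requires and scope the scaffolder service account accordingly. For cloud provisioning actions, use separate credentials per environment with permission boundaries. Note that the scaffolder configuration block itself carries no credentials; a typical block only sets defaults such as the commit author, and the credentials that need scoping live in the integrations config and in the environment of the backend process: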

scaffolder:
  defaultAuthor:
    name: Scaffolder
    email: scaffolder@example.com

For Kubernetes-targeting actions, use a service account with a ClusterRole limited to the resource types the scaffolder actually creates — not cluster-admin or wildcard resource permissions. Use separate scaffolder credentials for development and production environments; environment selection should be enforced by credential scope, not template logic.

Sandboxing Template Actions

Custom actions run in the same Node.js process as the Backstage backend. They have access to all environment variables, including secrets injected for other plugins. Isolating custom actions from the main process requires running them in a subprocess or a sandboxed executor.

For actions that run arbitrary scripts, use a container-based execution environment rather than executing directly in the scaffolder process. The SLSA build provenance model applies directly: scaffolder actions that provision infrastructure should be treated as build steps with provenance requirements, not as trusted code in a privileged process. Containers provide filesystem and network isolation that the in-process model cannot.

Secret Exposure in Backstage

Secrets in Catalog Metadata

The catalog is designed to store metadata, not secrets. Engineers frequently add secret references, API keys, and credential identifiers to catalog-info.yaml annotations:

metadata:
  annotations:
    vault.io/secret-path: secret/data/production/myservice
    aws.amazon.com/iam-role: arn:aws:iam::123456789012:role/prod-myservice

These are metadata references, not secrets — but they expose your secret store layout and IAM structure to any authenticated Backstage user. The catalog API is not a secret store. Treat catalog metadata as information disclosure surface and audit what annotations your entity files contain.
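
An annotation audit of this kind can start as a simple pattern scan over parsed entity files. The sketch below is illustrative only: the patterns are assumptions chosen to match the examples above, auditAnnotations is a hypothetical helper, and a real audit would tune the list to the annotation conventions in use.

```typescript
// Assumed patterns for values or keys that reveal secret-store or IAM layout.
const SENSITIVE_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: 'AWS ARN', pattern: /^arn:aws:/ },
  { label: 'Vault-style secret path', pattern: /^secret\// },
  { label: 'possible credential', pattern: /(token|password|apikey|api_key)/i },
];

// Return one finding per annotation and pattern that matched its key or value.
function auditAnnotations(annotations: Record<string, string>): string[] {
  const findings: string[] = [];
  for (const [key, value] of Object.entries(annotations)) {
    for (const { label, pattern } of SENSITIVE_PATTERNS) {
      if (pattern.test(key) || pattern.test(value)) {
        findings.push(`${key}: ${label}`);
      }
    }
  }
  return findings;
}
```

Findings are review candidates, not violations: some references are needed by plugins, and the question for each is whether every authenticated user should be able to read it.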

Plugin configuration in app-config.yaml frequently contains service credentials:

integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
vault:
  baseUrl: https://vault.example.com
  token: ${VAULT_TOKEN}

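These values are injected at runtime from environment variables, but part of app-config.yaml is delivered to the browser so that frontend plugins can read it. Backstage decides what to send based on config schema visibility: only keys declared with frontend visibility should reach the frontend, and keys marked secret should never leave the backend. A plugin with a missing or incorrect schema can break that assumption. Audit which config keys actually arrive in the frontend by inspecting the configuration the app serves to a browser session, and confirm that no backend credential is reachable via any frontend-facing API.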

TechDocs Rendering Pipeline

TechDocs builds documentation using MkDocs with configuration from the source repository. The mkdocs.yml file in the source repository can specify MkDocs plugins. If a malicious or compromised repository includes a mkdocs.yml that loads a rogue plugin package, that package executes during the TechDocs build.

Isolate the TechDocs build pipeline from the Backstage backend. Run builds in a container with a fixed set of allowed MkDocs plugins and do not permit the source repository’s mkdocs.yml to install arbitrary packages:

techdocs:
  builder: external
  publisher:
    type: awsS3
    awsS3:
      bucketName: my-techdocs-bucket
      region: eu-west-1

Setting techdocs.builder to external moves TechDocs builds into your CI system, where you control the build environment. Pre-built output is stored in object storage and served statically, removing the in-process build attack surface.
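
Inside that CI job, the fixed plugin set can be enforced before MkDocs runs. The following is a minimal sketch, assuming mkdocs.yml has already been parsed into an object by a YAML library; the allowlist contents and the disallowedPlugins helper are hypothetical. It covers only the plugins key, so other code-loading settings such as hooks or markdown_extensions would need their own checks.

```typescript
// In mkdocs.yml a plugin entry is either a bare name or a single-key mapping with options.
type MkDocsPluginEntry = string | Record<string, unknown>;
interface MkDocsConfig {
  plugins?: MkDocsPluginEntry[];
}

// Assumed allowlist for this sketch.
const ALLOWED_PLUGINS = new Set(['techdocs-core', 'search']);

// Return the names of all plugins that are not on the allowlist.
function disallowedPlugins(config: MkDocsConfig): string[] {
  const rejected: string[] = [];
  for (const entry of config.plugins ?? []) {
    const names = typeof entry === 'string' ? [entry] : Object.keys(entry);
    for (const name of names) {
      if (!ALLOWED_PLUGINS.has(name)) rejected.push(name);
    }
  }
  return rejected;
}
```

A build step would fail the job when the returned list is non-empty, and would install only the allowlisted plugin packages into the build container regardless of what the repository requests.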

RBAC in Backstage

Permission Framework

Without an explicit permission policy, Backstage defaults to allowing all authenticated users to perform all actions. Enable the permission framework and define an explicit deny-by-default policy:

import { createBackend } from '@backstage/backend-defaults';
import { BackstageIdentityResponse } from '@backstage/plugin-auth-node';
import {
  AuthorizeResult,
  PolicyDecision,
  isPermission,
} from '@backstage/plugin-permission-common';
import {
  PermissionPolicy,
  PolicyQuery,
} from '@backstage/plugin-permission-node';
import { catalogEntityDeletePermission } from '@backstage/plugin-catalog-common/alpha';

class DefaultDenyPolicy implements PermissionPolicy {
  async handle(
    request: PolicyQuery,
    user?: BackstageIdentityResponse,
  ): Promise<PolicyDecision> {
    if (isPermission(request.permission, catalogEntityDeletePermission)) {
      const userGroups = user?.identity.ownershipEntityRefs ?? [];
      if (userGroups.includes('group:default/platform-engineers')) {
        return { result: AuthorizeResult.ALLOW };
      }
      return { result: AuthorizeResult.DENY };
    }
    // Deny by default: add an explicit ALLOW branch above for each permission a role needs.
    return { result: AuthorizeResult.DENY };
  }
}

Least-Privilege Role Design

Define roles based on least privilege, informed by zero-trust architecture principles. A typical IDP needs three roles:

Developer — reads owned catalog entities, invokes developer-scoped scaffolder templates, cannot delete entities or register catalog locations.

Platform engineer — reads all catalog entities, invokes all templates, registers and deletes locations, manages Group and User entities.

Security engineer — read-only access to all catalog entities and scaffolder history, no template invocation or catalog mutation.

Map these roles to Backstage group membership and enforce in the permission policy. The permission framework supports conditional decisions based on entity ownership, enabling policies like “a user can delete an entity only if they own it.”
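
The role definitions above translate into a small decision table. The sketch below is framework-independent and does not use the Backstage permission API; Role, Action, AccessRequest, and decide are hypothetical names, and the action list is a simplification of the permissions a real policy would cover.

```typescript
type Role = 'developer' | 'platform-engineer' | 'security-engineer';
type Action =
  | 'catalog.read'
  | 'catalog.delete'
  | 'location.register'
  | 'template.invoke'
  | 'scaffolder.history.read';

interface AccessRequest {
  role: Role;
  action: Action;
  userRefs: string[]; // ownership refs of the caller, e.g. user and group refs
  entityOwner?: string; // owner ref of the target entity, if the action has one
  templateScope?: 'developer' | 'platform';
}

function decide(req: AccessRequest): 'ALLOW' | 'DENY' {
  switch (req.role) {
    case 'platform-engineer':
      // Reads everything, invokes all templates, registers and deletes.
      return 'ALLOW';
    case 'security-engineer':
      // Read-only: catalog and scaffolder history, no mutation or invocation.
      return req.action === 'catalog.read' || req.action === 'scaffolder.history.read'
        ? 'ALLOW'
        : 'DENY';
    case 'developer': {
      if (req.action === 'catalog.read') {
        // Conditional decision: only entities the caller owns.
        return req.entityOwner !== undefined && req.userRefs.includes(req.entityOwner)
          ? 'ALLOW'
          : 'DENY';
      }
      if (req.action === 'template.invoke') {
        return req.templateScope === 'developer' ? 'ALLOW' : 'DENY';
      }
      return 'DENY';
    }
    default:
      return 'DENY';
  }
}
```

Writing the table out this way makes gaps visible: any action not mentioned for a role falls through to DENY, which is the behaviour the permission policy should reproduce.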

Audit Logging

What to Log

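Backstage does not produce security-relevant audit logs in its default configuration, so audit logging has to be enabled deliberately. Depending on your version, that means the auditor service built into recent backend releases or a community audit-log package (for example @janus-idp/backstage-plugin-audit-log-node). Whichever you use, emit structured audit events and forward them to your centralized log management system.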

Critical events to capture:

  • Scaffolder task creation: actor, template, parameters, outcome
  • Catalog entity mutations: creation, modification, deletion with actor identity
  • Catalog location registration: actor and target URL
  • Authentication events: sign-in, sign-out, token refresh with source IP
  • Permission denials: resource and action attempted
  • Plugin API calls to high-value backends: Vault reads, cloud provider calls made through the Backstage proxy

Scaffolder Task Auditing

The scaffolder records task state in its database, but this is not a security audit log — it does not capture the full parameter set, it can be purged, and it is not integrity-protected. Forward scaffolder events to your SIEM with enough context to answer: who provisioned this resource, when, using which template, and with which parameters.

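// Illustrative only: addTaskEventSubscriber is a hypothetical hook, not a confirmed
// Backstage API. Wire this handler into whatever task event stream your scaffolder
// version exposes. `logger` is assumed to be the backend's structured logger.
scaffolderPlugin.addTaskEventSubscriber({
  async onEvent({ taskId, body }) {
    logger.info('scaffolder_task_event', {
      taskId,
      type: body.type,
      stepId: body.stepId,
      status: body.status,
    });
  },
});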

Log task completion events alongside the template name and parameter set (with secrets redacted). If a compromised developer account uses the scaffolder to provision infrastructure, this log is your primary evidence of what was created and when.
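
Redaction is the step most easily forgotten when parameters are logged. A minimal sketch follows, assuming secrets can be recognised by their parameter names; the key pattern and the redactParameters helper are hypothetical, and values that carry secrets under innocuous names would need an explicit list instead.

```typescript
// Assumed naming convention for secret-bearing parameters.
const SECRET_KEY_PATTERN = /(secret|token|password|passwd|api[-_]?key|credential)/i;

// Recursively copy a parameter object, replacing values under secret-looking keys.
function redactParameters(value: unknown, key = ''): unknown {
  if (SECRET_KEY_PATTERN.test(key)) {
    return '[REDACTED]';
  }
  if (Array.isArray(value)) {
    return value.map(item => redactParameters(item));
  }
  if (typeof value === 'object' && value !== null) {
    const copy: Record<string, unknown> = {};
    for (const [childKey, childValue] of Object.entries(value)) {
      copy[childKey] = redactParameters(childValue, childKey);
    }
    return copy;
  }
  return value;
}
```

The function returns a copy and leaves the original untouched, so the scaffolder task still receives the real values while the audit trail does not.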

Correlating IDP Actions with Downstream Changes

IDP audit logs are most useful when correlated with downstream system events. A scaffolder task that creates an AWS IAM role should produce a CloudTrail event; correlating the Backstage audit event with the CloudTrail record gives you end-to-end attribution. Catalog entity registrations should correlate with source code changes — if a new catalog-info.yaml creates a Location entity pointing at an unexpected external URL, alert on that before the catalog processor fetches and ingests it.
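
The correlation itself can be as simple as a join on resource name within a time window. The sketch below assumes both log sources have been normalised into the hypothetical shapes shown; real CloudTrail records carry the resource name inside request parameters, and the five-minute default window is an arbitrary choice.

```typescript
// Hypothetical normalised shapes for the two log sources.
interface ScaffolderAuditEvent {
  taskId: string;
  actor: string;
  resourceName: string;
  timestamp: string; // ISO 8601
}
interface DownstreamRecord {
  eventName: string;
  resourceName: string;
  eventTime: string; // ISO 8601
}
interface Attribution {
  record: DownstreamRecord;
  taskId?: string;
  actor?: string;
}

// For each downstream change, find a scaffolder event for the same resource
// that happened at most windowMs earlier. Unmatched records stay unattributed.
function attribute(
  events: ScaffolderAuditEvent[],
  records: DownstreamRecord[],
  windowMs = 300000,
): Attribution[] {
  return records.map(record => {
    const recordTime = Date.parse(record.eventTime);
    const source = events.find(event => {
      const delta = recordTime - Date.parse(event.timestamp);
      return event.resourceName === record.resourceName && delta >= 0 && delta <= windowMs;
    });
    return source ? { record, taskId: source.taskId, actor: source.actor } : { record };
  });
}
```

Records that come back without a taskId are the interesting ones: they are changes made with the scaffolder's credentials, or to resources it manages, that no IDP task accounts for.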

Hardening Checklist

  • Guest mode disabled in all non-local environments
  • OIDC provider configured with signIn.resolvers that reject users without catalog User entities
  • BACKSTAGE_BACKEND_SECRET and all integration credentials injected from a secret manager at runtime, not hardcoded in app-config.yaml
  • Catalog rules configured to restrict allowed entity kinds
  • Catalog location ingestion limited to your organization’s repositories on protected branches
  • catalog-info.yaml validation running in pre-merge CI for all repositories
  • Scaffolder custom actions using parameterized subprocess calls, not string interpolation into shell commands
  • Scaffolder service accounts scoped to minimum required cloud/Kubernetes permissions, separate per environment
  • TechDocs using external build mode with controlled build environment
  • Permission policy explicitly defined with least-privilege defaults
  • Backend plugin routes audited for unauthenticated endpoint exposure
  • Audit events forwarded to SIEM with scaffolder task parameters, catalog mutations, and permission denials
  • npm dependencies in the Backstage backend reviewed and locked with a lockfile; plugins from unknown publishers blocked via npm policy