AWS IRSA: IAM Roles for Service Accounts and OIDC Workload Identity
The Static Credentials Problem
Every Kubernetes workload that talks to AWS needs credentials. The naive approach is to create an IAM user, generate an access key pair, and store those keys in a Kubernetes Secret. Every production cluster has a few of these, often created under time pressure and never rotated.
The problems compound quickly. Static access keys do not expire. When a key leaks — through a misconfigured container image, a verbose log line, or a compromised developer machine — the window of exposure is from creation to the moment someone manually rotates it, which is typically measured in months or years. IAM users created for workloads tend to accumulate permissions over time because removing permissions breaks things and adding permissions fixes things, and no one has time to audit what a workload actually uses. The blast radius of a single leaked key is bounded only by the permissions attached to that user, which are rarely least-privilege.
The operational burden is real too. Rotation requires coordinating a secret update across potentially many pods, restarting deployments to pick up the new value, and ensuring no downtime during the transition. Secrets must be backed up somewhere, which creates additional exposure surfaces. And the Kubernetes Secret holding the access key is base64-encoded, not encrypted at rest unless the cluster has been explicitly configured with envelope encryption — a configuration that many clusters lack.
IRSA eliminates static keys entirely. Pods receive short-lived STS credentials scoped to a specific IAM role, obtained automatically by the AWS SDK, bound to a specific Kubernetes service account in a specific namespace. The credentials are valid for one hour by default and rotate continuously without any human intervention.
How OIDC Federation Works
EKS provisions an OpenID Connect (OIDC) identity provider for every cluster: the API server signs service account tokens, and EKS publishes a discovery document at a public well-known URL that describes how to validate the tokens the cluster issues. AWS IAM can be configured to trust this provider, which means IAM can verify that a token claiming to be from your cluster was actually signed by your cluster’s signing key.
The tokens in question are Kubernetes service account tokens projected into pods via a serviceAccountToken volume. These are not the legacy long-lived tokens that Kubernetes used to create automatically. They are short-lived (configurable; the EKS webhook requests a 24-hour expiry by default), audience-bound JWTs signed by the cluster’s service account token signing key. The payload contains standard OIDC claims: iss (the cluster’s OIDC issuer URL), sub (the service account identifier in the form system:serviceaccount:<namespace>:<name>), aud (the intended audience, set to sts.amazonaws.com for IRSA), and exp.
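The claims are easy to inspect. The sketch below builds a synthetic token with the same payload shape and decodes it the way any JWT consumer would; in a real pod the token sits at the projected volume path rather than being constructed locally, and the claim values here are illustrative:

```python
import base64
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def decode_payload(jwt: str) -> dict:
    """Decode the middle (payload) segment of a JWT without verifying the signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))

# A synthetic token with the same claim shape as a projected IRSA token.
claims = {
    "iss": "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE",
    "sub": "system:serviceaccount:production:s3-reader",
    "aud": ["sts.amazonaws.com"],
    "exp": int(time.time()) + 3600,
}
token = ".".join([
    b64url(b'{"alg":"RS256"}'),            # header
    b64url(json.dumps(claims).encode()),   # payload
    b64url(b"sig"),                        # signature placeholder
])

decoded = decode_payload(token)
print(decoded["sub"])  # system:serviceaccount:production:s3-reader
```

Note that aud may appear as a single string or a list depending on the issuer; treating it as a list is the safer parse.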
When a pod calls an AWS API, the AWS SDK detects the projected token file at AWS_WEB_IDENTITY_TOKEN_FILE and the role ARN at AWS_ROLE_ARN, both injected by the EKS pod mutating webhook. The SDK calls sts:AssumeRoleWithWebIdentity, passing the token and the role ARN. STS forwards the token to AWS IAM, which validates it against the OIDC provider configuration registered for your cluster. If the token is valid and the IAM role’s trust policy permits the token’s sub claim to assume the role, STS returns temporary credentials with a one-hour TTL. The SDK caches those credentials and refreshes them before expiry. From the application’s perspective, credentials are always available; no rotation code is required.
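The first half of that exchange can be sketched as follows: read the two injected variables and assemble the parameters for sts:AssumeRoleWithWebIdentity. This is a simulation of what the SDK’s web identity provider does internally, not code an application would write; the file contents and role ARN below are stand-ins:

```python
import os
import tempfile

def build_assume_role_request(session_name: str) -> dict:
    """Assemble the parameters the SDK passes to sts:AssumeRoleWithWebIdentity."""
    token_file = os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]
    with open(token_file) as f:
        token = f.read().strip()
    return {
        "RoleArn": os.environ["AWS_ROLE_ARN"],
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
        "DurationSeconds": 3600,  # default TTL of the returned STS credentials
    }

# Simulate the environment the EKS mutating webhook injects into the pod.
with tempfile.NamedTemporaryFile("w", suffix=".token", delete=False) as f:
    f.write("header.payload.signature")  # stand-in for the projected JWT
    os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"] = f.name
os.environ["AWS_ROLE_ARN"] = "arn:aws:iam::123456789012:role/production-s3-reader"

req = build_assume_role_request("my-app")
print(req["RoleArn"])  # arn:aws:iam::123456789012:role/production-s3-reader
```

The second half, caching the returned credentials and refreshing them before expiry, is handled entirely inside the SDK’s credential provider chain.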
This is the same OIDC federation mechanism used by other cloud workload identity systems. If you are running workloads across clouds or using a SPIFFE-based identity layer, see SPIFFE and SPIRE for Workload Identity for how cross-platform identity federation compares to cloud-native approaches.
Configuring the EKS OIDC Provider
New EKS clusters do not have the OIDC provider registered with IAM by default. You must create the association explicitly.
With eksctl:
eksctl utils associate-iam-oidc-provider \
--cluster my-cluster \
--region us-east-1 \
--approve
With the AWS CLI directly, first retrieve the cluster’s OIDC issuer URL:
aws eks describe-cluster \
--name my-cluster \
--region us-east-1 \
--query "cluster.identity.oidc.issuer" \
--output text
This returns something like https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE. Extract the hostname, compute the thumbprint of the root CA certificate in the endpoint’s chain, then register the provider:
ISSUER_URL=$(aws eks describe-cluster \
--name my-cluster \
--region us-east-1 \
--query "cluster.identity.oidc.issuer" \
--output text)
# -showcerts prints the full chain; the thumbprint must come from the last
# certificate in the chain (the root CA), not the leaf server certificate.
THUMBPRINT=$(echo | openssl s_client -connect oidc.eks.us-east-1.amazonaws.com:443 \
-servername oidc.eks.us-east-1.amazonaws.com -showcerts 2>/dev/null \
| awk '/BEGIN CERTIFICATE/{buf=""} {buf=buf $0 ORS} /END CERTIFICATE/{last=buf} END{printf "%s", last}' \
| openssl x509 -fingerprint -sha1 -noout \
| sed 's/://g' \
| awk -F= '{print tolower($2)}')
aws iam create-open-id-connect-provider \
--url "$ISSUER_URL" \
--client-id-list sts.amazonaws.com \
--thumbprint-list "$THUMBPRINT"
With Terraform:
data "tls_certificate" "eks" {
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
resource "aws_iam_openid_connect_provider" "eks" {
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
The thumbprint is the SHA-1 fingerprint of the root CA certificate that signed the OIDC endpoint’s TLS certificate. AWS uses this to validate that the OIDC provider endpoint is legitimate. The thumbprint is a source of operational maintenance burden — if AWS rotates the certificate authority behind the EKS OIDC endpoint, the thumbprint must be updated. EKS Pod Identity (covered below) eliminates this requirement.
Creating the IAM Role and Trust Policy
The IAM role’s trust policy is where the binding between a specific Kubernetes service account and an AWS IAM role is established. The trust policy must be precise — an overly broad condition allows any service account in the cluster, or even any service account across clusters sharing the same OIDC provider, to assume the role.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:production:s3-reader",
"oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
}
}
}
]
}
The sub condition pins the role assumption to the service account named s3-reader in the production namespace. A pod in the staging namespace using a service account also named s3-reader will produce a token with sub: system:serviceaccount:staging:s3-reader, which does not match the condition, and STS will reject the assumption request.
Always include the aud condition. Without it, any OIDC token issued by your cluster — regardless of its intended audience — could be used to attempt role assumption. The StringEquals operator performs exact matching; use StringLike only if you intentionally want to allow wildcards (for example, a shared role across multiple namespaces), and document that decision explicitly.
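The matching IAM applies to these two conditions can be sketched as a pair of string checks. This is a simplification for illustration, not IAM’s actual policy evaluator, which supports many more operators and combining rules:

```python
import fnmatch

EXPECTED_SUB = "system:serviceaccount:production:s3-reader"
EXPECTED_AUD = "sts.amazonaws.com"

def string_equals(claim: str, expected: str) -> bool:
    """IAM StringEquals: exact, case-sensitive match."""
    return claim == expected

def string_like(claim: str, pattern: str) -> bool:
    """IAM StringLike: supports * and ? wildcards."""
    return fnmatch.fnmatchcase(claim, pattern)

def may_assume(sub: str, aud: str) -> bool:
    """Both conditions in the trust policy must hold."""
    return string_equals(sub, EXPECTED_SUB) and string_equals(aud, EXPECTED_AUD)

print(may_assume("system:serviceaccount:production:s3-reader", "sts.amazonaws.com"))  # True
print(may_assume("system:serviceaccount:staging:s3-reader", "sts.amazonaws.com"))     # False
# A StringLike wildcard widens the match to every service account in the namespace:
print(string_like("system:serviceaccount:production:anything",
                  "system:serviceaccount:production:*"))                              # True
```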
Create the role with the eksctl helper, which constructs the trust policy from the cluster’s OIDC issuer automatically:
eksctl create iamserviceaccount \
--cluster my-cluster \
--namespace production \
--name s3-reader \
--attach-policy-arn arn:aws:iam::123456789012:policy/S3ReadProductionBucket \
--approve \
--region us-east-1
This creates both the IAM role (with the correctly scoped trust policy) and the Kubernetes ServiceAccount with the role annotation applied. If you manage IAM out-of-band, create the role manually and annotate the ServiceAccount separately.
Annotating the ServiceAccount
The EKS pod mutating webhook reads a specific annotation on the ServiceAccount to inject the required environment variables into pods. Without the annotation, no injection occurs and the pod will not have access to the projected token or role ARN.
apiVersion: v1
kind: ServiceAccount
metadata:
name: s3-reader
namespace: production
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/production-s3-reader
Optionally, control the token expiry duration in seconds (the webhook’s default is 86400; the Kubernetes TokenRequest API enforces a lower bound of 600):
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/production-s3-reader
eks.amazonaws.com/token-expiration: "3600"
The pod does not need any additional configuration. Any pod that references this ServiceAccount receives AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE automatically via the mutating webhook. The AWS SDKs (Python, Go, Java, Node.js, .NET, among others) check for these environment variables at startup and activate the web identity credential provider.
Verify the injection after deploying a pod:
kubectl exec -n production deploy/my-app -- env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'
Verify the token can be exchanged for credentials:
kubectl exec -n production deploy/my-app -- \
aws sts get-caller-identity
The returned ARN will be the assumed-role ARN, not an IAM user ARN, confirming IRSA is functioning.
EKS Pod Identity: The Newer Alternative
EKS Pod Identity, released in late 2023, is the successor to IRSA. It addresses several operational pain points without changing the security model for the pod itself.
The mechanism differs at the infrastructure level. Instead of federating through an OIDC provider registered in IAM, EKS Pod Identity uses a dedicated EKS-managed credential broker. The eks-pod-identity-agent DaemonSet runs on each node and serves credentials to pods over a node-local endpoint. Pods are mutated to include AWS_CONTAINER_CREDENTIALS_FULL_URI and AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE, pointing at the agent, rather than the OIDC token variables.
Install the Pod Identity Agent addon:
aws eks create-addon \
--cluster-name my-cluster \
--addon-name eks-pod-identity-agent \
--region us-east-1
Create a Pod Identity association:
aws eks create-pod-identity-association \
--cluster-name my-cluster \
--namespace production \
--service-account s3-reader \
--role-arn arn:aws:iam::123456789012:role/production-s3-reader \
--region us-east-1
The IAM role trust policy for Pod Identity uses a different principal:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": [
"sts:AssumeRole",
"sts:TagSession"
]
}
]
}
Pod Identity does not require an OIDC provider registration, thumbprint maintenance, or per-cluster configuration in IAM. The namespace and service account binding is enforced by the Pod Identity association stored in EKS, not in the IAM trust policy. This means the trust policy is reusable across clusters without modification. For new clusters, prefer Pod Identity unless you need to support an SDK version or tooling that does not yet recognise the container credentials URI.
The ServiceAccount does not require any annotation for Pod Identity. The association stored in EKS by create-pod-identity-association is sufficient.
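The container-credentials flow the SDK follows can be illustrated with a toy local stand-in for the agent: read the authorization token, GET the credentials URI with that token in the Authorization header, and parse the JSON response. The endpoint path, port, token value, and credential values below are all placeholders, not the real agent’s interface:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in credential payload; the real agent returns role-scoped STS credentials.
FAKE_CREDS = {
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "examplesecret",
    "Token": "examplesessiontoken",
    "Expiration": "2030-01-01T00:00:00Z",
}
AUTH_TOKEN = "projected-agent-token"  # stand-in for AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE contents

class ToyAgent(BaseHTTPRequestHandler):
    """Toy stand-in for the node-local eks-pod-identity-agent endpoint."""
    def do_GET(self):
        if self.headers.get("Authorization") != AUTH_TOKEN:
            self.send_response(401)
            self.end_headers()
            return
        body = json.dumps(FAKE_CREDS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ToyAgent)
threading.Thread(target=server.serve_forever, daemon=True).start()

# What the SDK does with AWS_CONTAINER_CREDENTIALS_FULL_URI plus the token file.
uri = f"http://127.0.0.1:{server.server_port}/v1/credentials"
req = urllib.request.Request(uri, headers={"Authorization": AUTH_TOKEN})
creds = json.loads(urllib.request.urlopen(req).read())
server.shutdown()
print(creds["AccessKeyId"])  # ASIAEXAMPLE
```

The design point this illustrates: the pod never sees long-lived material, only a node-local HTTP hop guarded by a projected token.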
Least-Privilege IAM Scoping
IRSA and Pod Identity solve the credential delivery problem but do not automatically produce least-privilege IAM policies. That work still needs to be done, and it matters: if a pod is compromised, the attacker inherits whatever the IAM role can do.
S3: Prefix-Scoped Access
A workload that reads application configuration from a specific S3 prefix should not have permission to read the entire bucket, and certainly not to write or delete.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ListBucketPrefix",
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::my-app-config",
"Condition": {
"StringLike": {
"s3:prefix": "production/my-app/*"
}
}
},
{
"Sid": "GetObjects",
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-app-config/production/my-app/*"
}
]
}
No s3:PutObject, no s3:DeleteObject, no s3:*. If the workload only reads, the policy only allows reads. The prefix condition on ListBucket prevents the workload from enumerating objects it should not know exist.
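The intent of the two statements can be sketched as a check over (action, key) pairs. This approximates the policy’s effect for illustration; it is not IAM’s evaluator:

```python
import fnmatch

ALLOWED_PREFIX = "production/my-app/*"

def allowed(action: str, key: str = "") -> bool:
    """Approximate the ListBucketPrefix and GetObjects statements above."""
    if action == "s3:ListBucket":
        # Listing is permitted only when the request's prefix matches the condition.
        return fnmatch.fnmatchcase(key, ALLOWED_PREFIX)
    if action == "s3:GetObject":
        # Reads are permitted only under the scoped object key prefix.
        return fnmatch.fnmatchcase(key, ALLOWED_PREFIX)
    return False  # no PutObject, no DeleteObject, nothing else

print(allowed("s3:GetObject", "production/my-app/settings.yaml"))    # True
print(allowed("s3:GetObject", "production/other-app/settings.yaml")) # False
print(allowed("s3:DeleteObject", "production/my-app/settings.yaml")) # False
```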
DynamoDB: Table-Level Scoping
A workload that owns a single DynamoDB table should be restricted to that table’s ARN, including its indexes.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "TableAccess",
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:UpdateItem",
"dynamodb:DeleteItem",
"dynamodb:Query",
"dynamodb:BatchGetItem",
"dynamodb:BatchWriteItem"
],
"Resource": [
"arn:aws:dynamodb:us-east-1:123456789012:table/production-orders",
"arn:aws:dynamodb:us-east-1:123456789012:table/production-orders/index/*"
]
}
]
}
dynamodb:Scan is excluded deliberately — full table scans are expensive and rarely required by application logic. If a workload claims to need Scan, that claim should be verified before the permission is added. dynamodb:DescribeTable and dynamodb:CreateTable are also excluded; the application should not be managing its own table schema at runtime.
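A quick way to catch over-broad grants during review is a lint pass over the policy document. The sketch below flags wildcard actions and the deliberately excluded operations discussed above; the flagged-action list is this article’s convention, not an AWS standard:

```python
import json

# Actions this article excludes by default; add to taste.
FLAGGED = {"dynamodb:Scan", "dynamodb:DescribeTable", "dynamodb:CreateTable"}

def lint(policy_json: str) -> list:
    """Return human-readable findings for over-broad or flagged actions."""
    findings = []
    for stmt in json.loads(policy_json)["Statement"]:
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"wildcard action: {action}")
            if action in FLAGGED:
                findings.append(f"flagged action: {action}")
    return findings

policy = ('{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", '
          '"Action": ["dynamodb:GetItem", "dynamodb:Scan", "s3:*"], "Resource": "*"}]}')
print(lint(policy))  # ['flagged action: dynamodb:Scan', 'wildcard action: s3:*']
```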
Using IAM Access Analyzer
Run IAM Access Analyzer’s unused access analysis against IRSA roles after a workload has been running in production for 30 days. Access Analyzer tracks which IAM actions were actually called and flags permissions that were never used. This produces a concrete, data-driven least-privilege recommendation rather than requiring manual analysis of application code.
aws accessanalyzer create-analyzer \
--analyzer-name eks-workload-analyzer \
--type ACCOUNT_UNUSED_ACCESS \
--configuration '{"unusedAccess": {"unusedAccessAge": 30}}'
Auditing and Detecting Abuse
Every sts:AssumeRoleWithWebIdentity call is recorded in CloudTrail. The event contains the role ARN, the source IP, the federated identity (the sub claim from the JWT), and whether the assumption succeeded or failed. This is the primary audit mechanism for IRSA.
Key fields in the CloudTrail event:
requestParameters.roleArn: the role being assumed.
requestParameters.webIdentityToken: not logged in full, but the token’s claims are decoded into additionalEventData.
userIdentity.sessionContext.webIdFederationData.federatedIdentity: the full sub claim, e.g. system:serviceaccount:production:s3-reader.
sourceIPAddress: the node IP or a VPC endpoint IP.
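Those fields are plain dictionary lookups on the raw event. The event below is a minimal skeleton containing only the fields named here, not a complete CloudTrail record:

```python
# Minimal skeleton of an AssumeRoleWithWebIdentity CloudTrail event (illustrative).
event = {
    "eventName": "AssumeRoleWithWebIdentity",
    "sourceIPAddress": "10.0.42.17",
    "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/production-s3-reader"},
    "userIdentity": {
        "sessionContext": {
            "webIdFederationData": {
                "federatedIdentity": "system:serviceaccount:production:s3-reader"
            }
        }
    },
}

def service_account(ev: dict) -> str:
    """Extract the full sub claim recorded for the assumption."""
    return ev["userIdentity"]["sessionContext"]["webIdFederationData"]["federatedIdentity"]

# The sub claim encodes both namespace and service account name.
ns, sa = service_account(event).rsplit(":", 2)[-2:]
print(ns, sa)  # production s3-reader
```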
An Athena query to identify all distinct service accounts that assumed a given role over the past 7 days:
SELECT
useridentity.sessioncontext.webidfederationdata.federatedidentity AS service_account,
sourceipaddress,
COUNT(*) AS assumption_count
FROM cloudtrail_logs
WHERE
eventname = 'AssumeRoleWithWebIdentity'
AND requestparameters LIKE '%production-s3-reader%'
AND eventtime >= DATE_FORMAT(DATE_ADD('day', -7, NOW()), '%Y-%m-%dT%H:%i:%SZ')
GROUP BY 1, 2
ORDER BY assumption_count DESC;
Alerts to configure:
Cross-namespace assumption attempts: If the IRSA trust policy is correctly scoped, assumption attempts from any service account other than the expected one will fail. A CloudTrail event with errorCode: AccessDenied and errorMessage containing AssumeRoleWithWebIdentity for an IRSA role is a signal worth alerting on. It means a pod with a different service account attempted to assume the role — either a misconfiguration or an active attempt to escalate privileges using a stolen token.
Assumption from unexpected source IPs: IRSA assumptions originate from node IPs within the cluster’s VPC. An assumption from an IP outside the expected CIDR ranges indicates a token was exfiltrated and used from an external environment. Filter CloudTrail AssumeRoleWithWebIdentity events by sourceIPAddress and alert on any that fall outside VPC CIDR ranges.
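The source-IP check itself is a few lines with the standard library’s ipaddress module. The CIDR ranges below are examples; substitute your cluster’s actual VPC configuration:

```python
import ipaddress

# Example VPC CIDR ranges; replace with the cluster VPC's real ranges.
VPC_CIDRS = [ipaddress.ip_network("10.0.0.0/16"), ipaddress.ip_network("172.31.0.0/16")]

def is_expected_source(ip: str) -> bool:
    """True if the assumption originated from inside the cluster's VPC."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in VPC_CIDRS)

print(is_expected_source("10.0.42.17"))    # True: a node IP inside the VPC
print(is_expected_source("203.0.113.50"))  # False: worth an alert
```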
Volume anomalies: A workload that normally assumes its role a few hundred times per day suddenly assuming it thousands of times per hour may indicate a control loop bug, an attacker attempting to refresh credentials repeatedly, or a pod crash-looping with an IAM-dependent startup check. Set a CloudWatch metric filter on the count of AssumeRoleWithWebIdentity events per role ARN and alarm on deviation.
For organisations running a zero trust architecture, IRSA assumptions should feed into the broader identity and access telemetry pipeline. The service account identity from the sub claim maps directly to a workload identity and can be correlated with network flow logs and application-level audit events to reconstruct the full context of a credential use.
Common Mistakes
One role per workload type, not per environment: A single IAM role used by both the production and staging deployments of a service means a staging pod compromise grants production-level AWS access. Each environment gets its own role with its own scoped trust policy.
Missing the aud condition: Omitting the audience condition from the trust policy means any JWT issued by the cluster’s OIDC provider — regardless of the intended audience — can be used to attempt role assumption. Always include StringEquals on the aud claim set to sts.amazonaws.com.
Annotating the Deployment instead of the ServiceAccount: The annotation must be on the ServiceAccount, not the Pod or Deployment spec. The mutating webhook reads ServiceAccount annotations. A Deployment annotation is silently ignored.
Using StringLike with * on sub: A trust policy with "sub": "system:serviceaccount:production:*" allows any service account in the production namespace to assume the role. This is rarely intended and significantly broadens the blast radius if any pod in that namespace is compromised.
Not enabling EKS OIDC provider before creating service accounts: If pods are deployed before the OIDC provider is registered, the mutating webhook may not inject environment variables. Restart the relevant pods after registering the provider if this occurs.
Relying on IRSA for non-EKS workloads: IRSA is EKS-specific. For EC2, ECS, and Lambda, different mechanisms apply: ECS tasks use task IAM roles, EC2 instances use instance profiles, and Lambda uses execution roles. Self-managed Kubernetes clusters can use the same OIDC federation pattern, but only if you host the discovery document and JWKS at a URL IAM can reach, and the issuer URL and thumbprint configuration must be verified carefully.