Private Access Integration
The Private Access integration lets Autoheal run kubectl commands against EKS clusters that have no public API endpoint. It works by creating an SSM port-forwarding tunnel from the Autoheal sandbox through a bastion EC2 instance to the EKS API server — no VPN, no SSH, no public exposure needed.
How It Works
- Autoheal launches a background process inside the sandbox using the credentials from your AWS integration
- For each configured EKS cluster, it opens an SSM port-forwarding tunnel:
sandbox → bastion EC2 → EKS API server:443
- It generates a kubeconfig that routes kubectl traffic through the local tunnel endpoint
- kubectl commands in the sandbox transparently reach your private cluster
The bastion never needs a public IP or SSH access — AWS SSM manages the connection entirely.
Architecture
Autoheal Sandbox
│
│ SSM port-forward session (encrypted, outbound-only)
│
▼
Bastion EC2 (in your VPC — no inbound ports, no public IP)
│
│ TCP to private EKS endpoint
│
▼
EKS API Server (private)
- Credentials are short-lived, obtained via OIDC federation through your AWS integration — no long-lived keys stored anywhere.
- Tunnels auto-restart on failure; kubectl traffic is transparently rerouted.
- Each cluster gets its own local port (10100, 10101, …) with a matching kubeconfig context (pa-<cluster-name>).
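To illustrate, a kubeconfig routed through the tunnel would look roughly like the sketch below. This is illustrative only, not the exact file Autoheal writes — TLS verification settings for the tunneled endpoint are omitted, and the user name is a placeholder:

```yaml
# Sketch of a tunnel-routed kubeconfig (illustrative — the generated file may differ)
apiVersion: v1
kind: Config
clusters:
  - name: pa-my-cluster
    cluster:
      # kubectl talks to the local tunnel endpoint, not the private EKS URL
      server: https://127.0.0.1:10100
contexts:
  - name: pa-my-cluster
    context:
      cluster: pa-my-cluster
      user: autoheal          # placeholder user entry
current-context: pa-my-cluster
```

The key point is the server field: kubectl only ever sees localhost, and the SSM tunnel carries the traffic to the private API endpoint.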
What You Need to Prepare
Before configuring the integration in Autoheal, make sure you have:
- An AWS integration in Autoheal — configured with OIDC federation (set one up if you haven't already)
- A bastion EC2 instance — t3.micro or similar running Amazon Linux 2023, in a subnet that can reach your EKS API server. Attach the AmazonSSMManagedInstanceCore managed policy via an instance profile. No inbound ports or public IP needed.
- IAM permissions on the Autoheal role — ssm:StartSession / ssm:TerminateSession scoped to the bastion, plus eks:DescribeCluster and eks:ListClusters
- EKS cluster access for the Autoheal role — via EKS Access Entries (recommended) or the aws-auth ConfigMap. Read-only (AmazonEKSViewPolicy) is sufficient.
- IAM administrative credentials — needed to run the one-time setup commands in this guide (AWS CLI access to create roles, security groups, and EKS access entries)
Set Your Variables
Run this block once at the start of a terminal session. Every command in this guide reuses these variables.
- macOS / Linux
- Windows (PowerShell)
# Your 12-digit AWS account ID — find it in the AWS Console top-right menu,
# or run: aws sts get-caller-identity --query Account --output text
export AWS_ACCOUNT_ID="123456789012"
# AWS region where your EKS clusters and bastion will live (e.g. us-east-1, eu-west-1)
export AWS_REGION="us-east-1"
# VPC that contains your private EKS cluster — the bastion must be in this VPC
# (or a peered VPC) so it can reach the EKS API server
export VPC_ID="vpc-0abc1234"
# A private subnet inside that VPC for the bastion EC2 instance.
# Choose a subnet with a route to the internet (NAT Gateway) or SSM VPC endpoints
# so the SSM agent can register. The subnet does NOT need a public IP.
export SUBNET_ID="subnet-0abc1234"
# The name of the private EKS cluster you want Autoheal to access.
# If you have multiple clusters, you will repeat the relevant steps for each.
export EKS_CLUSTER_NAME="my-cluster"
# Your 12-digit AWS account ID — find it in the AWS Console top-right menu,
# or run: aws sts get-caller-identity --query Account --output text
$env:AWS_ACCOUNT_ID = "123456789012"
# AWS region where your EKS clusters and bastion will live (e.g. us-east-1, eu-west-1)
$env:AWS_REGION = "us-east-1"
# VPC that contains your private EKS cluster — the bastion must be in this VPC
# (or a peered VPC) so it can reach the EKS API server
$env:VPC_ID = "vpc-0abc1234"
# A private subnet inside that VPC for the bastion EC2 instance.
# Choose a subnet with a route to the internet (NAT Gateway) or SSM VPC endpoints
# so the SSM agent can register. The subnet does NOT need a public IP.
$env:SUBNET_ID = "subnet-0abc1234"
# The name of the private EKS cluster you want Autoheal to access.
# If you have multiple clusters, you will repeat the relevant steps for each.
$env:EKS_CLUSTER_NAME = "my-cluster"
Find your VPC and subnet IDs in the VPC Console or run:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-vpcs --query 'Vpcs[*].[VpcId,Tags[?Key==`Name`].Value|[0],CidrBlock]' --output table --region "${AWS_REGION}"
aws ec2 describe-subnets --filters "Name=vpc-id,Values=${VPC_ID}" --query 'Subnets[*].[SubnetId,AvailabilityZone,CidrBlock,Tags[?Key==`Name`].Value|[0]]' --output table --region "${AWS_REGION}"
aws ec2 describe-vpcs --query 'Vpcs[*].[VpcId,Tags[?Key==`Name`].Value|[0],CidrBlock]' --output table --region $env:AWS_REGION
aws ec2 describe-subnets --filters "Name=vpc-id,Values=$($env:VPC_ID)" --query 'Subnets[*].[SubnetId,AvailabilityZone,CidrBlock,Tags[?Key==`Name`].Value|[0]]' --output table --region $env:AWS_REGION
Step 1 — Create the Bastion IAM Role
The bastion EC2 instance needs an IAM instance profile with the AmazonSSMManagedInstanceCore policy so the SSM agent can register and accept sessions.
- macOS / Linux
- Windows (PowerShell)
# Create the IAM role for EC2
aws iam create-role \
--role-name AutohealBastionRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com" },
"Action": "sts:AssumeRole"
}]
}'
# Attach the SSM managed policy
aws iam attach-role-policy \
--role-name AutohealBastionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
# Create the instance profile and attach the role
aws iam create-instance-profile \
--instance-profile-name AutohealBastionProfile
aws iam add-role-to-instance-profile \
--instance-profile-name AutohealBastionProfile \
--role-name AutohealBastionRole
echo "✅ Bastion IAM role and instance profile created"
# Create the IAM role for EC2
aws iam create-role `
--role-name AutohealBastionRole `
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com" },
"Action": "sts:AssumeRole"
}]
}'
# Attach the SSM managed policy
aws iam attach-role-policy `
--role-name AutohealBastionRole `
--policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
# Create the instance profile and attach the role
aws iam create-instance-profile `
--instance-profile-name AutohealBastionProfile
aws iam add-role-to-instance-profile `
--instance-profile-name AutohealBastionProfile `
--role-name AutohealBastionRole
Write-Host "✅ Bastion IAM role and instance profile created"
Step 2 — Create the Bastion Security Group
Create the security group before launching the instance. The bastion needs no inbound rules — SSM connects outbound. It needs outbound port 443 to reach both AWS SSM endpoints and the EKS API server.
- macOS / Linux
- Windows (PowerShell)
# Create the security group
BASTION_SG=$(aws ec2 create-security-group \
--group-name autoheal-bastion-sg \
--description "Autoheal bastion - SSM port forwarding to EKS" \
--vpc-id "${VPC_ID}" \
--region "${AWS_REGION}" \
--query GroupId --output text)
echo "Bastion SG: ${BASTION_SG}"
# Remove the default outbound-all rule (optional, for tighter control)
aws ec2 revoke-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol -1 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}" 2>/dev/null || true
# Allow outbound HTTPS — required for SSM endpoints and EKS API
aws ec2 authorize-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}"
echo "✅ Security group ${BASTION_SG} created with outbound 443"
# Create the security group
$BASTION_SG = aws ec2 create-security-group `
--group-name autoheal-bastion-sg `
--description "Autoheal bastion - SSM port forwarding to EKS" `
--vpc-id $env:VPC_ID `
--region $env:AWS_REGION `
--query GroupId --output text
Write-Host "Bastion SG: $BASTION_SG"
# Remove the default outbound-all rule (optional, for tighter control)
aws ec2 revoke-security-group-egress `
--group-id $BASTION_SG `
--protocol -1 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION 2>$null
# Allow outbound HTTPS — required for SSM endpoints and EKS API
aws ec2 authorize-security-group-egress `
--group-id $BASTION_SG `
--protocol tcp `
--port 443 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION
Write-Host "✅ Security group $BASTION_SG created with outbound 443"
No inbound rules are added. AWS SSM uses outbound HTTPS connections initiated by the SSM agent — there is no inbound attack surface.
Step 3 — Launch the Bastion EC2 Instance
Use Amazon Linux 2023 — the SSM agent is pre-installed and starts automatically on boot. Place the instance in the same VPC as your EKS cluster, with no public IP.
- macOS / Linux
- Windows (PowerShell)
# Look up the latest Amazon Linux 2023 AMI in your region
AMI_ID=$(aws ec2 describe-images \
--owners amazon \
--filters "Name=name,Values=al2023-ami-*-x86_64" \
"Name=state,Values=available" \
--query 'sort_by(Images, &CreationDate)[-1].ImageId' \
--output text \
--region "${AWS_REGION}")
echo "Using AMI: ${AMI_ID}"
# Launch the instance
BASTION_INSTANCE_ID=$(aws ec2 run-instances \
--image-id "${AMI_ID}" \
--instance-type t3.micro \
--iam-instance-profile Name=AutohealBastionProfile \
--security-group-ids "${BASTION_SG}" \
--subnet-id "${SUBNET_ID}" \
--no-associate-public-ip-address \
--metadata-options HttpTokens=required \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=autoheal-bastion}]" \
--region "${AWS_REGION}" \
--query 'Instances[0].InstanceId' \
--output text)
echo "Bastion instance ID: ${BASTION_INSTANCE_ID}"
# Look up the latest Amazon Linux 2023 AMI in your region
$AMI_ID = aws ec2 describe-images `
--owners amazon `
--filters "Name=name,Values=al2023-ami-*-x86_64" "Name=state,Values=available" `
--query 'sort_by(Images, &CreationDate)[-1].ImageId' `
--output text `
--region $env:AWS_REGION
Write-Host "Using AMI: $AMI_ID"
# Launch the instance
$BASTION_INSTANCE_ID = aws ec2 run-instances `
--image-id $AMI_ID `
--instance-type t3.micro `
--iam-instance-profile Name=AutohealBastionProfile `
--security-group-ids $BASTION_SG `
--subnet-id $env:SUBNET_ID `
--no-associate-public-ip-address `
--metadata-options HttpTokens=required `
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=autoheal-bastion}]" `
--region $env:AWS_REGION `
--query 'Instances[0].InstanceId' `
--output text
Write-Host "Bastion instance ID: $BASTION_INSTANCE_ID"
Save the value of BASTION_INSTANCE_ID — you will need it in Steps 4, 5, and when creating the integration in Autoheal. If you close your terminal, retrieve it again with:
- macOS / Linux
- Windows (PowerShell)
BASTION_INSTANCE_ID=$(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=autoheal-bastion" "Name=instance-state-name,Values=running" \
--query 'Reservations[0].Instances[0].InstanceId' \
--output text --region "${AWS_REGION}")
echo "Bastion instance ID: ${BASTION_INSTANCE_ID}"
$BASTION_INSTANCE_ID = aws ec2 describe-instances `
--filters "Name=tag:Name,Values=autoheal-bastion" "Name=instance-state-name,Values=running" `
--query 'Reservations[0].Instances[0].InstanceId' `
--output text --region $env:AWS_REGION
Write-Host "Bastion instance ID: $BASTION_INSTANCE_ID"
Step 4 — Allow the Bastion to Reach the EKS API Server
The EKS API server has its own security group that controls which sources can connect to port 443. Add the bastion security group as an allowed source.
- macOS / Linux
- Windows (PowerShell)
# Get the EKS cluster's security group ID
EKS_SG=$(aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' \
--output text)
echo "EKS cluster SG: ${EKS_SG}"
# Allow inbound 443 from bastion SG → EKS cluster SG
aws ec2 authorize-security-group-ingress \
--group-id "${EKS_SG}" \
--protocol tcp \
--port 443 \
--source-group "${BASTION_SG}" \
--region "${AWS_REGION}"
echo "✅ Bastion can now reach EKS API server on port 443"
# Get the EKS cluster's security group ID
$EKS_SG = aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' `
--output text
Write-Host "EKS cluster SG: $EKS_SG"
# Allow inbound 443 from bastion SG → EKS cluster SG
aws ec2 authorize-security-group-ingress `
--group-id $EKS_SG `
--protocol tcp `
--port 443 `
--source-group $BASTION_SG `
--region $env:AWS_REGION
Write-Host "✅ Bastion can now reach EKS API server on port 443"
Endpoint access modes:
- If your cluster has public + private access: the bastion in the same VPC automatically hits the private endpoint. No further configuration needed.
- If your cluster has private access only (publicAccess=false): the bastion must be in the same VPC (or a peered VPC). No public access is used.
Check your cluster's endpoint configuration:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.resourcesVpcConfig.{publicAccess:endpointPublicAccess,privateAccess:endpointPrivateAccess}'
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.resourcesVpcConfig.{publicAccess:endpointPublicAccess,privateAccess:endpointPrivateAccess}'
Step 5 — Verify SSM Connectivity
Wait 1–2 minutes after launch for the SSM agent to register, then check:
- macOS / Linux
- Windows (PowerShell)
# Wait for the instance to be running
aws ec2 wait instance-running \
--instance-ids "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
echo "Instance is running. Waiting for SSM agent to register (~60s)..."
sleep 60
# Check SSM status
aws ssm describe-instance-information \
--filters "Key=InstanceIds,Values=${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'InstanceInformationList[0].PingStatus' \
--output text
# Wait for the instance to be running
aws ec2 wait instance-running `
--instance-ids $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
Write-Host "Instance is running. Waiting for SSM agent to register (~60s)..."
Start-Sleep -Seconds 60
# Check SSM status
aws ssm describe-instance-information `
--filters "Key=InstanceIds,Values=$BASTION_INSTANCE_ID" `
--region $env:AWS_REGION `
--query 'InstanceInformationList[0].PingStatus' `
--output text
Expected output: Online
If the output is None or Offline, see Bastion shows Offline in SSM in the troubleshooting section below.
Optional: verify the bastion can reach the EKS endpoint
First, get the EKS endpoint hostname:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.endpoint' \
--output text
# Example output: https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.endpoint' `
--output text
# Example output: https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
Then open a shell on the bastion and test connectivity from there:
- macOS / Linux
- Windows (PowerShell)
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
aws ssm start-session `
--target $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
Once inside the bastion shell, run the following, substituting your cluster endpoint URL from above:
curl -k -o /dev/null -w "HTTP status: %{http_code}\n" https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
Expected: HTTP status: 403 — the EKS API server responds with 403 (unauthenticated), which confirms the bastion can reach it on port 443. If the command hangs or returns a connection error, the security group rule from Step 4 is missing or incorrect.
Step 6 — Add IAM Permissions to the Autoheal Role
Now that you have the bastion instance ID, add SSM and EKS permissions to the Autoheal role that was created during your AWS integration setup.
- macOS / Linux
- Windows (PowerShell)
aws iam put-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--policy-document "$(cat <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "SSMTunnel",
"Effect": "Allow",
"Action": [
"ssm:StartSession",
"ssm:TerminateSession"
],
"Resource": [
"arn:aws:ec2:${AWS_REGION}:${AWS_ACCOUNT_ID}:instance/${BASTION_INSTANCE_ID}",
"arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost"
]
},
{
"Sid": "EKSDescribe",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster"
],
"Resource": "*"
}
]
}
POLICY
)"
echo "✅ Private Access IAM policy attached to AutohealReadOnlyRole"
$policy = @"
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "SSMTunnel",
"Effect": "Allow",
"Action": [
"ssm:StartSession",
"ssm:TerminateSession"
],
"Resource": [
"arn:aws:ec2:$($env:AWS_REGION):$($env:AWS_ACCOUNT_ID):instance/$BASTION_INSTANCE_ID",
"arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost"
]
},
{
"Sid": "EKSDescribe",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster"
],
"Resource": "*"
}
]
}
"@
aws iam put-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--policy-document $policy
Write-Host "✅ Private Access IAM policy attached to AutohealReadOnlyRole"
Verify the policy was added:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[*].{Sid:Sid,Actions:Action}' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[*].{Sid:Sid,Actions:Action}' `
--output table
Both resources in the SSMTunnel statement are required for port-forwarding sessions:
- The EC2 instance ARN — allows starting a session on this specific bastion
- The SSM document ARN — allows using the AWS-StartPortForwardingSessionToRemoteHost document type

ssm:StartSession on the instance ARN alone is not sufficient and will result in AccessDeniedException.
Step 7 — Authorize the Autoheal Role in the EKS Cluster
kubectl commands running inside the Autoheal sandbox authenticate using the Autoheal IAM role. That role must be authorized to access the Kubernetes API.
Check which authentication mode your cluster uses:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.accessConfig.authenticationMode' \
--output text
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.accessConfig.authenticationMode' `
--output text
- API or API_AND_CONFIG_MAP (recommended)
- CONFIG_MAP (legacy)
If the output is API or API_AND_CONFIG_MAP, use EKS Access Entries — the modern approach that requires no kubectl access to configure.
- macOS / Linux
- Windows (PowerShell)
# Create an access entry for the Autoheal IAM role
aws eks create-access-entry \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--region "${AWS_REGION}"
# Grant read-only access to all cluster resources
aws eks associate-access-policy \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
--access-scope type=cluster \
--region "${AWS_REGION}"
echo "✅ Autoheal role authorized in ${EKS_CLUSTER_NAME}"
# Create an access entry for the Autoheal IAM role
aws eks create-access-entry `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--region $env:AWS_REGION
# Grant read-only access to all cluster resources
aws eks associate-access-policy `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy `
--access-scope type=cluster `
--region $env:AWS_REGION
Write-Host "✅ Autoheal role authorized in $($env:EKS_CLUSTER_NAME)"
Verify the access entry was created:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-access-entry \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--region "${AWS_REGION}" \
--query 'accessEntry.{principal:principalArn,type:type,createdAt:createdAt}'
aws eks describe-access-entry `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--region $env:AWS_REGION `
--query 'accessEntry.{principal:principalArn,type:type,createdAt:createdAt}'
AmazonEKSViewPolicy grants read-only access to pods, services, deployments, nodes, events, and most other Kubernetes resources. This is sufficient for Autoheal to inspect cluster state during an investigation.
To restrict to specific namespaces instead of the whole cluster:
- macOS / Linux
- Windows (PowerShell)
aws eks associate-access-policy \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
--access-scope type=namespace \
--namespaces default production \
--region "${AWS_REGION}"
aws eks associate-access-policy `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy `
--access-scope type=namespace `
--namespaces default production `
--region $env:AWS_REGION
If the output is CONFIG_MAP, edit the aws-auth ConfigMap directly. You need kubectl configured with admin access to your cluster.
First, configure kubectl to talk to the cluster:
- macOS / Linux
- Windows (PowerShell)
aws eks update-kubeconfig \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}"
aws eks update-kubeconfig `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION
Then apply an RBAC binding that maps the Autoheal role to read-only Kubernetes access:
- macOS / Linux
- Windows (PowerShell)
# Create a ClusterRoleBinding that gives the Autoheal user read-only access
kubectl create clusterrolebinding autoheal-view \
--clusterrole=view \
--user=autoheal
# Create a ClusterRoleBinding that gives the Autoheal user read-only access
kubectl create clusterrolebinding autoheal-view `
--clusterrole=view `
--user=autoheal
Then add the role mapping to the aws-auth ConfigMap. The safest way is to patch it:
- macOS / Linux
- Windows (PowerShell)
# Export the current ConfigMap
kubectl get configmap aws-auth -n kube-system -o yaml > /tmp/aws-auth-backup.yaml
cat /tmp/aws-auth-backup.yaml # review current state
# Add the Autoheal role mapping
kubectl patch configmap aws-auth -n kube-system --type merge -p "
data:
mapRoles: |
$(kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}')
- rolearn: arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole
username: autoheal
groups: []
"
# Export the current ConfigMap
kubectl get configmap aws-auth -n kube-system -o yaml > $env:TEMP\aws-auth-backup.yaml
Get-Content $env:TEMP\aws-auth-backup.yaml # review current state
# Add the Autoheal role mapping
$existingRoles = kubectl get configmap aws-auth -n kube-system `
-o jsonpath='{.data.mapRoles}'
$patch = @"
data:
mapRoles: |
$existingRoles
- rolearn: arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole
username: autoheal
groups: []
"@
kubectl patch configmap aws-auth -n kube-system --type merge -p $patch
The patch approach above appends to the existing mapRoles. If the formatting breaks, restore from backup:
- macOS / Linux
- Windows (PowerShell)
kubectl apply -f /tmp/aws-auth-backup.yaml
kubectl apply -f $env:TEMP\aws-auth-backup.yaml
Alternatively, edit the ConfigMap directly with kubectl edit configmap aws-auth -n kube-system and add the entry manually.
The groups: [] is intentional — the view ClusterRoleBinding above maps the autoheal username directly, not via a group. Do not use system:masters unless you intend to grant full cluster-admin access.
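After the patch, the mapRoles block should look something like this sketch — the first entry is a placeholder standing in for whatever mappings already existed in your ConfigMap (for example, the worker-node role), which must be preserved:

```yaml
mapRoles: |
  # ...existing entries preserved as-is, e.g. the worker-node role mapping...
  - rolearn: arn:aws:iam::123456789012:role/AutohealReadOnlyRole
    username: autoheal
    groups: []
```

If the new entry is indented differently from the existing ones, the ConfigMap becomes invalid YAML and all IAM-to-Kubernetes mappings break — hence the backup taken above.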
Verify the mapping is in place:
- macOS / Linux
- Windows (PowerShell)
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' | grep -A3 AutohealReadOnlyRole
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' |
Select-String -Pattern "AutohealReadOnlyRole" -Context 0,3
Step 8 — Create the Integration in Autoheal
- Go to Integrations in Autoheal
- Click Private Access → AWS SSM Tunnel
- Fill in the fields:
- AWS Integration — select the AWS integration configured with OIDC federation (e.g., "Production AWS")
- Bastion Instance ID — paste the instance ID from Step 3 (e.g., i-0dc03fbcae2a87472)
- AWS Region — the region where your bastion and EKS clusters reside (e.g., us-east-1)
- EKS Clusters — select from the dropdown; Autoheal lists all clusters your AWS integration can see in the specified region
- Click Save
The integration will show as active within ~30 seconds once the SSM tunnel is established.
How Traffic Flows
Autoheal Sandbox
│
│ kubectl get pods (hits localhost:10100)
│
▼
localhost:10100 ──── SSM port-forward ────► Bastion EC2 ────► EKS API server :443
Ports are assigned sequentially starting at 10100. For multiple clusters:
- First cluster → localhost:10100
- Second cluster → localhost:10101
The kubeconfig is written automatically with the correct context names and server addresses.
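The assignment scheme can be sketched in a few lines of shell. The cluster names below are placeholders and the loop is illustrative only — Autoheal derives the real cluster list from your AWS integration and does this automatically:

```shell
# Sketch of the per-cluster port and context assignment (illustrative only)
CLUSTERS=("my-cluster" "staging-cluster")   # placeholder names
BASE_PORT=10100

for i in "${!CLUSTERS[@]}"; do
  port=$((BASE_PORT + i))                   # 10100, 10101, ...
  context="pa-${CLUSTERS[$i]}"              # pa-<cluster-name>
  echo "${context} -> https://127.0.0.1:${port}"
done
```

Running this prints `pa-my-cluster -> https://127.0.0.1:10100` and `pa-staging-cluster -> https://127.0.0.1:10101`, matching the mapping above.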
Troubleshooting
Integration shows 'starting' but never becomes active
The setup process prints tunnel ready once all SSM tunnels are established. If the integration stays in starting for more than 60 seconds:
Check SSM status:
- macOS / Linux
- Windows (PowerShell)
aws ssm describe-instance-information \
--filters "Key=InstanceIds,Values=${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'InstanceInformationList[0].{status:PingStatus,agentVersion:AgentVersion,ip:IPAddress}' \
--output table
aws ssm describe-instance-information `
--filters "Key=InstanceIds,Values=$BASTION_INSTANCE_ID" `
--region $env:AWS_REGION `
--query 'InstanceInformationList[0].{status:PingStatus,agentVersion:AgentVersion,ip:IPAddress}' `
--output table
Test that SSM can open a session to the bastion:
- macOS / Linux
- Windows (PowerShell)
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
aws ssm start-session `
--target $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
If this fails with TargetNotConnected, the SSM agent is not registered — see "Bastion shows Offline" below. Type exit to close the session.
Test bastion → EKS reachability using a port-forward tunnel:
This opens a port-forwarding tunnel through the bastion to the EKS API server, then checks if the API server responds:
- macOS / Linux
- Windows (PowerShell)
# Get the EKS endpoint hostname
EKS_ENDPOINT=$(aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.endpoint' --output text | sed 's|https://||')
# Open a port-forward tunnel: localhost:19999 → bastion → EKS API:443
SSM_PARAMS='{"host":["'"${EKS_ENDPOINT}"'"],"portNumber":["443"],"localPortNumber":["19999"]}'
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--document-name AWS-StartPortForwardingSessionToRemoteHost \
--parameters "${SSM_PARAMS}" \
--region "${AWS_REGION}" &
SSM_PID=$!
sleep 3 # wait for tunnel to establish
# Check if the EKS API server responds through the tunnel
curl -k -o /dev/null -w "HTTP status: %{http_code}\n" https://127.0.0.1:19999
kill "${SSM_PID}" # close the tunnel
# Get the EKS endpoint hostname
$EKS_ENDPOINT = (aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.endpoint' --output text) -replace 'https://',''
# Open a port-forward tunnel: localhost:19999 → bastion → EKS API:443
$SSM_PARAMS = '{"host":["' + $EKS_ENDPOINT + '"],"portNumber":["443"],"localPortNumber":["19999"]}'
$ssmProc = Start-Process -NoNewWindow -PassThru aws -ArgumentList `
"ssm","start-session",
"--target",$BASTION_INSTANCE_ID,
"--document-name","AWS-StartPortForwardingSessionToRemoteHost",
"--parameters",$SSM_PARAMS,
"--region",$env:AWS_REGION
Start-Sleep -Seconds 3 # wait for tunnel to establish
# Check if the EKS API server responds through the tunnel
curl.exe -k -o NUL -w "HTTP status: %{http_code}\n" https://127.0.0.1:19999
$ssmProc.Kill() # close the tunnel
Expected: HTTP status: 403 — the EKS API server responds (403 = unauthenticated, which is correct). If the curl hangs or the SSM command fails, the security group rule from Step 4 is missing or incorrect.
kubectl commands fail with 'Unauthorized' or 'Forbidden'
The SSM tunnel is working, but the Autoheal IAM role is not authorized in the EKS cluster.
Check access entries (modern clusters):
- macOS / Linux
- Windows (PowerShell)
aws eks list-access-entries \
--cluster-name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'accessEntries' \
--output table
aws eks list-access-entries `
--cluster-name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'accessEntries' `
--output table
Check aws-auth ConfigMap (legacy clusters):
kubectl get configmap aws-auth -n kube-system -o yaml
Confirm arn:aws:iam::ACCOUNT:role/AutohealReadOnlyRole appears under mapRoles.
Check the RBAC binding:
kubectl get clusterrolebinding autoheal-view -o yaml
Verify the exact role ARN — OIDC-assumed roles sometimes include a session suffix in CloudTrail, but the access entry must match the base role ARN exactly:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role \
--role-name AutohealReadOnlyRole \
--query 'Role.Arn' \
--output text
aws iam get-role `
--role-name AutohealReadOnlyRole `
--query 'Role.Arn' `
--output text
EKS clusters dropdown is empty
The Autoheal role cannot list EKS clusters.
Verify the IAM policy was attached:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[?Sid==`EKSDescribe`].Action' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[?Sid==`EKSDescribe`].Action' `
--output table
Test listing clusters using the role's effective permissions:
- macOS / Linux
- Windows (PowerShell)
aws eks list-clusters --region "${AWS_REGION}" --output table
aws eks list-clusters --region $env:AWS_REGION --output table
If this works from your terminal but the dropdown is empty, verify the region field in the Autoheal integration form matches the region where your clusters are deployed.
Bastion shows Offline in SSM
The SSM agent cannot reach AWS SSM endpoints.
Check the instance profile is attached:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-instances \
--instance-ids "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' \
--output text
aws ec2 describe-instances `
--instance-ids $BASTION_INSTANCE_ID `
--region $env:AWS_REGION `
--query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' `
--output text
Expected: an ARN containing AutohealBastionProfile. If None, attach the profile:
- macOS / Linux
- Windows (PowerShell)
aws ec2 associate-iam-instance-profile \
--instance-id "${BASTION_INSTANCE_ID}" \
--iam-instance-profile Name=AutohealBastionProfile \
--region "${AWS_REGION}"
aws ec2 associate-iam-instance-profile `
--instance-id $BASTION_INSTANCE_ID `
--iam-instance-profile Name=AutohealBastionProfile `
--region $env:AWS_REGION
Check the role is attached to the profile:
- macOS / Linux
- Windows (PowerShell)
aws iam get-instance-profile \
--instance-profile-name AutohealBastionProfile \
--query 'InstanceProfile.Roles[*].RoleName' \
--output text
aws iam get-instance-profile `
--instance-profile-name AutohealBastionProfile `
--query 'InstanceProfile.Roles[*].RoleName' `
--output text
Check outbound 443 from the security group:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-security-groups \
--group-ids "${BASTION_SG}" \
--region "${AWS_REGION}" \
--query 'SecurityGroups[0].IpPermissionsEgress'
aws ec2 describe-security-groups `
--group-ids $BASTION_SG `
--region $env:AWS_REGION `
--query 'SecurityGroups[0].IpPermissionsEgress'
There should be a rule for TCP port 443 to 0.0.0.0/0. If missing, add it:
- macOS / Linux
- Windows (PowerShell)
aws ec2 authorize-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}"
aws ec2 authorize-security-group-egress `
--group-id $BASTION_SG `
--protocol tcp `
--port 443 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION
The bastion needs outbound 443 to reach AWS SSM service endpoints. In a private subnet with no NAT Gateway, the SSM agent cannot reach these endpoints. Options:
- Add a NAT Gateway to the subnet's route table
- Add SSM VPC endpoints (ssm, ssmmessages, ec2messages) to the VPC
'AccessDeniedException' when starting SSM session
The Autoheal role's ssm:StartSession permission is missing or scoped incorrectly.
Verify both required resources are in the policy:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[?Sid==`SSMTunnel`].Resource' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[?Sid==`SSMTunnel`].Resource' `
--output table
The output must include both:
- arn:aws:ec2:REGION:ACCOUNT:instance/INSTANCE_ID
- arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost
ssm:StartSession on the instance ARN alone is not sufficient for port-forwarding — AWS requires explicit permission on the SSM document as well.
If the instance ID in the policy doesn't match the bastion, re-run the Step 6 command with the correct BASTION_INSTANCE_ID.
Security Notes
- No inbound firewall rules needed: The bastion has zero inbound rules. All connectivity is outbound-only, initiated by the SSM agent.
- No SSH, no public IP: The bastion does not expose SSH. It has no public IP address.
- Temporary credentials only: The Autoheal OIDC role issues short-lived credentials (typically 1 hour) via STS. No long-lived keys are stored anywhere.
- Least privilege: The ssm:StartSession resource is scoped to the specific bastion instance ARN. The EKS access entry uses AmazonEKSViewPolicy (read-only), not system:masters.
- Audit trail: Every SSM session and kubectl API call appears in AWS CloudTrail under the Autoheal role identity.