Private Access Integration
The Private Access integration lets Autoheal run kubectl commands against EKS clusters that have no public API endpoint. It works by creating an SSM port-forwarding tunnel from the Autoheal sandbox through a bastion EC2 instance to the EKS API server — no VPN, no SSH, no public exposure needed.
How It Works
- Autoheal launches a background process inside the sandbox using the credentials from your AWS integration
- For each configured EKS cluster, it opens an SSM port-forwarding tunnel:
sandbox → bastion EC2 → EKS API server:443
- It generates a kubeconfig that routes kubectl traffic through the local tunnel endpoint
- kubectl commands in the sandbox transparently reach your private cluster
The bastion never needs a public IP or SSH access — AWS SSM manages the connection entirely.
Architecture
Autoheal Sandbox
│
│ SSM port-forward session (encrypted, outbound-only)
│
▼
Bastion EC2 (in your VPC — no inbound ports, no public IP)
│
│ TCP to private EKS endpoint
│
▼
EKS API Server (private)
- Credentials are short-lived, obtained via OIDC federation through your AWS integration — no long-lived keys stored anywhere.
- Tunnels auto-restart on failure; kubectl traffic is transparently rerouted.
- Each cluster gets its own local port (10100, 10101, …) with a matching kubeconfig context (pa-<cluster-name>).
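To illustrate, a kubeconfig routed through the tunnel would look roughly like the sketch below. This is illustrative only, not the exact file Autoheal writes — TLS verification settings for the tunneled endpoint are omitted, and the user name is a placeholder:

```yaml
# Sketch of a tunnel-routed kubeconfig (illustrative — the generated file may differ)
apiVersion: v1
kind: Config
clusters:
  - name: pa-my-cluster
    cluster:
      # kubectl talks to the local tunnel endpoint, not the private EKS URL
      server: https://127.0.0.1:10100
contexts:
  - name: pa-my-cluster
    context:
      cluster: pa-my-cluster
      user: autoheal          # placeholder user entry
current-context: pa-my-cluster
```

The key point is the server field: kubectl only ever sees localhost, and the SSM tunnel carries the traffic to the private API endpoint.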
What You Need to Prepare
Before configuring the integration in Autoheal, make sure you have:
- An AWS integration in Autoheal — configured with OIDC federation (set one up if you haven't already)
- A bastion EC2 instance — t3.micro or similar running Amazon Linux 2023, in a subnet that can reach your EKS API server. Attach the AmazonSSMManagedInstanceCore managed policy via an instance profile. No inbound ports or public IP needed.
- IAM permissions on the Autoheal role — ssm:StartSession / ssm:TerminateSession scoped to the bastion, plus eks:DescribeCluster and eks:ListClusters
- EKS cluster access for the Autoheal role — via EKS Access Entries (recommended) or the aws-auth ConfigMap. Read-only (AmazonEKSViewPolicy) is sufficient.
- IAM administrative credentials — needed to run the one-time setup commands in this guide (AWS CLI access to create roles, security groups, and EKS access entries)
Set Your Variables
Run this block once at the start of a terminal session. Every command in this guide reuses these variables.
- macOS / Linux
- Windows (PowerShell)
# Your 12-digit AWS account ID — find it in the AWS Console top-right menu,
# or run: aws sts get-caller-identity --query Account --output text
export AWS_ACCOUNT_ID="123456789012"
# AWS region where your EKS clusters and bastion will live (e.g. us-east-1, eu-west-1)
export AWS_REGION="us-east-1"
# VPC that contains your private EKS cluster — the bastion must be in this VPC
# (or a peered VPC) so it can reach the EKS API server
export VPC_ID="vpc-0abc1234"
# A private subnet inside that VPC for the bastion EC2 instance.
# Choose a subnet with a route to the internet (NAT Gateway) or SSM VPC endpoints
# so the SSM agent can register. The subnet does NOT need a public IP.
export SUBNET_ID="subnet-0abc1234"
# The name of the private EKS cluster you want Autoheal to access.
# If you have multiple clusters, you will repeat the relevant steps for each.
export EKS_CLUSTER_NAME="my-cluster"
# Your 12-digit AWS account ID — find it in the AWS Console top-right menu,
# or run: aws sts get-caller-identity --query Account --output text
$env:AWS_ACCOUNT_ID = "123456789012"
# AWS region where your EKS clusters and bastion will live (e.g. us-east-1, eu-west-1)
$env:AWS_REGION = "us-east-1"
# VPC that contains your private EKS cluster — the bastion must be in this VPC
# (or a peered VPC) so it can reach the EKS API server
$env:VPC_ID = "vpc-0abc1234"
# A private subnet inside that VPC for the bastion EC2 instance.
# Choose a subnet with a route to the internet (NAT Gateway) or SSM VPC endpoints
# so the SSM agent can register. The subnet does NOT need a public IP.
$env:SUBNET_ID = "subnet-0abc1234"
# The name of the private EKS cluster you want Autoheal to access.
# If you have multiple clusters, you will repeat the relevant steps for each.
$env:EKS_CLUSTER_NAME = "my-cluster"
Find your VPC and subnet IDs in the VPC Console or run:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-vpcs --query 'Vpcs[*].[VpcId,Tags[?Key==`Name`].Value|[0],CidrBlock]' --output table --region "${AWS_REGION}"
aws ec2 describe-subnets --filters "Name=vpc-id,Values=${VPC_ID}" --query 'Subnets[*].[SubnetId,AvailabilityZone,CidrBlock,Tags[?Key==`Name`].Value|[0]]' --output table --region "${AWS_REGION}"
aws ec2 describe-vpcs --query 'Vpcs[*].[VpcId,Tags[?Key==`Name`].Value|[0],CidrBlock]' --output table --region $env:AWS_REGION
aws ec2 describe-subnets --filters "Name=vpc-id,Values=$($env:VPC_ID)" --query 'Subnets[*].[SubnetId,AvailabilityZone,CidrBlock,Tags[?Key==`Name`].Value|[0]]' --output table --region $env:AWS_REGION
Step 1 — Create the Bastion IAM Role
The bastion EC2 instance needs an IAM instance profile with the AmazonSSMManagedInstanceCore policy so the SSM agent can register and accept sessions.
- macOS / Linux
- Windows (PowerShell)
# Create the IAM role for EC2
aws iam create-role \
--role-name AutohealBastionRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com" },
"Action": "sts:AssumeRole"
}]
}'
# Attach the SSM managed policy
aws iam attach-role-policy \
--role-name AutohealBastionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
# Create the instance profile and attach the role
aws iam create-instance-profile \
--instance-profile-name AutohealBastionProfile
aws iam add-role-to-instance-profile \
--instance-profile-name AutohealBastionProfile \
--role-name AutohealBastionRole
echo "✅ Bastion IAM role and instance profile created"
# Create the IAM role for EC2
aws iam create-role `
--role-name AutohealBastionRole `
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com" },
"Action": "sts:AssumeRole"
}]
}'
# Attach the SSM managed policy
aws iam attach-role-policy `
--role-name AutohealBastionRole `
--policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
# Create the instance profile and attach the role
aws iam create-instance-profile `
--instance-profile-name AutohealBastionProfile
aws iam add-role-to-instance-profile `
--instance-profile-name AutohealBastionProfile `
--role-name AutohealBastionRole
Write-Host "✅ Bastion IAM role and instance profile created"
Step 2 — Create the Bastion Security Group
Create the security group before launching the instance. The bastion needs no inbound rules — SSM connects outbound. It needs outbound port 443 to reach both AWS SSM endpoints and the EKS API server.
- macOS / Linux
- Windows (PowerShell)
# Create the security group
BASTION_SG=$(aws ec2 create-security-group \
--group-name autoheal-bastion-sg \
--description "Autoheal bastion - SSM port forwarding to EKS" \
--vpc-id "${VPC_ID}" \
--region "${AWS_REGION}" \
--query GroupId --output text)
echo "Bastion SG: ${BASTION_SG}"
# Remove the default outbound-all rule (optional, for tighter control)
aws ec2 revoke-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol -1 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}" 2>/dev/null || true
# Allow outbound HTTPS — required for SSM endpoints and EKS API
aws ec2 authorize-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}"
echo "✅ Security group ${BASTION_SG} created with outbound 443"
# Create the security group
$BASTION_SG = aws ec2 create-security-group `
--group-name autoheal-bastion-sg `
--description "Autoheal bastion - SSM port forwarding to EKS" `
--vpc-id $env:VPC_ID `
--region $env:AWS_REGION `
--query GroupId --output text
Write-Host "Bastion SG: $BASTION_SG"
# Remove the default outbound-all rule (optional, for tighter control)
aws ec2 revoke-security-group-egress `
--group-id $BASTION_SG `
--protocol -1 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION 2>$null
# Allow outbound HTTPS — required for SSM endpoints and EKS API
aws ec2 authorize-security-group-egress `
--group-id $BASTION_SG `
--protocol tcp `
--port 443 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION
Write-Host "✅ Security group $BASTION_SG created with outbound 443"
No inbound rules are added. AWS SSM uses outbound HTTPS connections initiated by the SSM agent — there is no inbound attack surface.
Step 3 — Launch the Bastion EC2 Instance
Use Amazon Linux 2023 — the SSM agent is pre-installed and starts automatically on boot. Place the instance in the same VPC as your EKS cluster, with no public IP.
- macOS / Linux
- Windows (PowerShell)
# Look up the latest Amazon Linux 2023 AMI in your region
AMI_ID=$(aws ec2 describe-images \
--owners amazon \
--filters "Name=name,Values=al2023-ami-*-x86_64" \
"Name=state,Values=available" \
--query 'sort_by(Images, &CreationDate)[-1].ImageId' \
--output text \
--region "${AWS_REGION}")
echo "Using AMI: ${AMI_ID}"
# Launch the instance
BASTION_INSTANCE_ID=$(aws ec2 run-instances \
--image-id "${AMI_ID}" \
--instance-type t3.micro \
--iam-instance-profile Name=AutohealBastionProfile \
--security-group-ids "${BASTION_SG}" \
--subnet-id "${SUBNET_ID}" \
--no-associate-public-ip-address \
--metadata-options HttpTokens=required \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=autoheal-bastion}]" \
--region "${AWS_REGION}" \
--query 'Instances[0].InstanceId' \
--output text)
echo "Bastion instance ID: ${BASTION_INSTANCE_ID}"
# Look up the latest Amazon Linux 2023 AMI in your region
$AMI_ID = aws ec2 describe-images `
--owners amazon `
--filters "Name=name,Values=al2023-ami-*-x86_64" "Name=state,Values=available" `
--query 'sort_by(Images, &CreationDate)[-1].ImageId' `
--output text `
--region $env:AWS_REGION
Write-Host "Using AMI: $AMI_ID"
# Launch the instance
$BASTION_INSTANCE_ID = aws ec2 run-instances `
--image-id $AMI_ID `
--instance-type t3.micro `
--iam-instance-profile Name=AutohealBastionProfile `
--security-group-ids $BASTION_SG `
--subnet-id $env:SUBNET_ID `
--no-associate-public-ip-address `
--metadata-options HttpTokens=required `
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=autoheal-bastion}]" `
--region $env:AWS_REGION `
--query 'Instances[0].InstanceId' `
--output text
Write-Host "Bastion instance ID: $BASTION_INSTANCE_ID"
Save the value of BASTION_INSTANCE_ID — you will need it in Steps 4, 5, and when creating the integration in Autoheal. If you close your terminal, retrieve it again with:
- macOS / Linux
- Windows (PowerShell)
BASTION_INSTANCE_ID=$(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=autoheal-bastion" "Name=instance-state-name,Values=running" \
--query 'Reservations[0].Instances[0].InstanceId' \
--output text --region "${AWS_REGION}")
echo "Bastion instance ID: ${BASTION_INSTANCE_ID}"
$BASTION_INSTANCE_ID = aws ec2 describe-instances `
--filters "Name=tag:Name,Values=autoheal-bastion" "Name=instance-state-name,Values=running" `
--query 'Reservations[0].Instances[0].InstanceId' `
--output text --region $env:AWS_REGION
Write-Host "Bastion instance ID: $BASTION_INSTANCE_ID"
Step 4 — Allow the Bastion to Reach the EKS API Server
The EKS API server has its own security group that controls which sources can connect to port 443. Add the bastion security group as an allowed source.
- macOS / Linux
- Windows (PowerShell)
# Get the EKS cluster's security group ID
EKS_SG=$(aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' \
--output text)
echo "EKS cluster SG: ${EKS_SG}"
# Allow inbound 443 from bastion SG → EKS cluster SG
aws ec2 authorize-security-group-ingress \
--group-id "${EKS_SG}" \
--protocol tcp \
--port 443 \
--source-group "${BASTION_SG}" \
--region "${AWS_REGION}"
echo "✅ Bastion can now reach EKS API server on port 443"
# Get the EKS cluster's security group ID
$EKS_SG = aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' `
--output text
Write-Host "EKS cluster SG: $EKS_SG"
# Allow inbound 443 from bastion SG → EKS cluster SG
aws ec2 authorize-security-group-ingress `
--group-id $EKS_SG `
--protocol tcp `
--port 443 `
--source-group $BASTION_SG `
--region $env:AWS_REGION
Write-Host "✅ Bastion can now reach EKS API server on port 443"
Endpoint access modes:
- If your cluster has public + private access: the bastion in the same VPC automatically hits the private endpoint. No further configuration needed.
- If your cluster has private access only (publicAccess=false): the bastion must be in the same VPC (or a peered VPC). No public access is used.
Check your cluster's endpoint configuration:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.resourcesVpcConfig.{publicAccess:endpointPublicAccess,privateAccess:endpointPrivateAccess}'
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.resourcesVpcConfig.{publicAccess:endpointPublicAccess,privateAccess:endpointPrivateAccess}'
Step 5 — Verify SSM Connectivity
Wait 1–2 minutes after launch for the SSM agent to register, then check:
- macOS / Linux
- Windows (PowerShell)
# Wait for the instance to be running
aws ec2 wait instance-running \
--instance-ids "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
echo "Instance is running. Waiting for SSM agent to register (~60s)..."
sleep 60
# Check SSM status
aws ssm describe-instance-information \
--filters "Key=InstanceIds,Values=${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'InstanceInformationList[0].PingStatus' \
--output text
# Wait for the instance to be running
aws ec2 wait instance-running `
--instance-ids $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
Write-Host "Instance is running. Waiting for SSM agent to register (~60s)..."
Start-Sleep -Seconds 60
# Check SSM status
aws ssm describe-instance-information `
--filters "Key=InstanceIds,Values=$BASTION_INSTANCE_ID" `
--region $env:AWS_REGION `
--query 'InstanceInformationList[0].PingStatus' `
--output text
Expected output: Online
If the output is None or Offline, see Bastion shows Offline in SSM in the troubleshooting section below.
Optional: verify the bastion can reach the EKS endpoint
First, get the EKS endpoint hostname:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.endpoint' \
--output text
# Example output: https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.endpoint' `
--output text
# Example output: https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
Then open a shell on the bastion and test connectivity from there:
- macOS / Linux
- Windows (PowerShell)
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
aws ssm start-session `
--target $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
Once inside the bastion shell, run the following, substituting your cluster endpoint URL from above:
curl -k -o /dev/null -w "HTTP status: %{http_code}\n" https://ABCD1234.gr7.us-east-1.eks.amazonaws.com
Expected: HTTP status: 403 — the EKS API server responds with 403 (unauthenticated), which confirms the bastion can reach it on port 443. If the command hangs or returns a connection error, the security group rule from Step 4 is missing or incorrect.
Step 6 — Add IAM Permissions to the Autoheal Role
Now that you have the bastion instance ID, add SSM and EKS permissions to the Autoheal role that was created during your AWS integration setup.
- macOS / Linux
- Windows (PowerShell)
aws iam put-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--policy-document "$(cat <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "SSMTunnel",
"Effect": "Allow",
"Action": [
"ssm:StartSession",
"ssm:TerminateSession"
],
"Resource": [
"arn:aws:ec2:${AWS_REGION}:${AWS_ACCOUNT_ID}:instance/${BASTION_INSTANCE_ID}",
"arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost"
]
},
{
"Sid": "EKSDescribe",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster"
],
"Resource": "*"
}
]
}
POLICY
)"
echo "✅ Private Access IAM policy attached to AutohealReadOnlyRole"
$policy = @"
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "SSMTunnel",
"Effect": "Allow",
"Action": [
"ssm:StartSession",
"ssm:TerminateSession"
],
"Resource": [
"arn:aws:ec2:$($env:AWS_REGION):$($env:AWS_ACCOUNT_ID):instance/$BASTION_INSTANCE_ID",
"arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost"
]
},
{
"Sid": "EKSDescribe",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster"
],
"Resource": "*"
}
]
}
"@
aws iam put-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--policy-document $policy
Write-Host "✅ Private Access IAM policy attached to AutohealReadOnlyRole"
Verify the policy was added:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[*].{Sid:Sid,Actions:Action}' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[*].{Sid:Sid,Actions:Action}' `
--output table
Both resources in the SSMTunnel statement are required for port-forwarding sessions:
- The EC2 instance ARN — allows starting a session on this specific bastion
- The SSM document ARN — allows using the AWS-StartPortForwardingSessionToRemoteHost document type

ssm:StartSession on the instance ARN alone is not sufficient and will result in AccessDeniedException.
Step 7 — Authorize the Autoheal Role in the EKS Cluster
kubectl commands running inside the Autoheal sandbox authenticate using the Autoheal IAM role. That role must be authorized to access the Kubernetes API.
Check which authentication mode your cluster uses:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.accessConfig.authenticationMode' \
--output text
aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.accessConfig.authenticationMode' `
--output text
- API or API_AND_CONFIG_MAP (recommended)
- CONFIG_MAP (legacy)
If the output is API or API_AND_CONFIG_MAP, use EKS Access Entries — the modern approach that requires no kubectl access to configure.
- macOS / Linux
- Windows (PowerShell)
# Create an access entry for the Autoheal IAM role
aws eks create-access-entry \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--region "${AWS_REGION}"
# Grant read-only access to all cluster resources
aws eks associate-access-policy \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
--access-scope type=cluster \
--region "${AWS_REGION}"
echo "✅ Autoheal role authorized in ${EKS_CLUSTER_NAME}"
# Create an access entry for the Autoheal IAM role
aws eks create-access-entry `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--region $env:AWS_REGION
# Grant read-only access to all cluster resources
aws eks associate-access-policy `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy `
--access-scope type=cluster `
--region $env:AWS_REGION
Write-Host "✅ Autoheal role authorized in $($env:EKS_CLUSTER_NAME)"
Verify the access entry was created:
- macOS / Linux
- Windows (PowerShell)
aws eks describe-access-entry \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--region "${AWS_REGION}" \
--query 'accessEntry.{principal:principalArn,type:type,createdAt:createdAt}'
aws eks describe-access-entry `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--region $env:AWS_REGION `
--query 'accessEntry.{principal:principalArn,type:type,createdAt:createdAt}'
AmazonEKSViewPolicy grants read-only access to pods, services, deployments, nodes, events, and most other Kubernetes resources. This is sufficient for Autoheal to inspect cluster state during an investigation.
To restrict to specific namespaces instead of the whole cluster:
- macOS / Linux
- Windows (PowerShell)
aws eks associate-access-policy \
--cluster-name "${EKS_CLUSTER_NAME}" \
--principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole" \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
--access-scope type=namespace \
--namespaces default production \
--region "${AWS_REGION}"
aws eks associate-access-policy `
--cluster-name $env:EKS_CLUSTER_NAME `
--principal-arn "arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole" `
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy `
--access-scope type=namespace `
--namespaces default production `
--region $env:AWS_REGION
If the output is CONFIG_MAP, edit the aws-auth ConfigMap directly. You need kubectl configured with admin access to your cluster.
First, configure kubectl to talk to the cluster:
- macOS / Linux
- Windows (PowerShell)
aws eks update-kubeconfig \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}"
aws eks update-kubeconfig `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION
Then apply an RBAC binding that maps the Autoheal role to read-only Kubernetes access:
- macOS / Linux
- Windows (PowerShell)
# Create a ClusterRoleBinding that gives the Autoheal user read-only access
kubectl create clusterrolebinding autoheal-view \
--clusterrole=view \
--user=autoheal
# Create a ClusterRoleBinding that gives the Autoheal user read-only access
kubectl create clusterrolebinding autoheal-view `
--clusterrole=view `
--user=autoheal
Then add the role mapping to the aws-auth ConfigMap. The safest way is to patch it:
- macOS / Linux
- Windows (PowerShell)
# Export the current ConfigMap
kubectl get configmap aws-auth -n kube-system -o yaml > /tmp/aws-auth-backup.yaml
cat /tmp/aws-auth-backup.yaml # review current state
# Add the Autoheal role mapping
kubectl patch configmap aws-auth -n kube-system --type merge -p "
data:
mapRoles: |
$(kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}')
- rolearn: arn:aws:iam::${AWS_ACCOUNT_ID}:role/AutohealReadOnlyRole
username: autoheal
groups: []
"
# Export the current ConfigMap
kubectl get configmap aws-auth -n kube-system -o yaml > $env:TEMP\aws-auth-backup.yaml
Get-Content $env:TEMP\aws-auth-backup.yaml # review current state
# Add the Autoheal role mapping
$existingRoles = kubectl get configmap aws-auth -n kube-system `
-o jsonpath='{.data.mapRoles}'
$patch = @"
data:
mapRoles: |
$existingRoles
- rolearn: arn:aws:iam::$($env:AWS_ACCOUNT_ID):role/AutohealReadOnlyRole
username: autoheal
groups: []
"@
kubectl patch configmap aws-auth -n kube-system --type merge -p $patch
The patch approach above appends to the existing mapRoles. If the formatting breaks, restore from backup:
- macOS / Linux
- Windows (PowerShell)
kubectl apply -f /tmp/aws-auth-backup.yaml
kubectl apply -f $env:TEMP\aws-auth-backup.yaml
Alternatively, edit the ConfigMap directly with kubectl edit configmap aws-auth -n kube-system and add the entry manually.
The groups: [] is intentional — the view ClusterRoleBinding above maps the autoheal username directly, not via a group. Do not use system:masters unless you intend to grant full cluster-admin access.
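After the patch, the mapRoles block should look something like this sketch — the first entry is a placeholder standing in for whatever mappings already existed in your ConfigMap (for example, the worker-node role), which must be preserved:

```yaml
mapRoles: |
  # ...existing entries preserved as-is, e.g. the worker-node role mapping...
  - rolearn: arn:aws:iam::123456789012:role/AutohealReadOnlyRole
    username: autoheal
    groups: []
```

If the new entry is indented differently from the existing ones, the ConfigMap becomes invalid YAML and all IAM-to-Kubernetes mappings break — hence the backup taken above.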
Verify the mapping is in place:
- macOS / Linux
- Windows (PowerShell)
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' | grep -A3 AutohealReadOnlyRole
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' |
Select-String -Pattern "AutohealReadOnlyRole" -Context 0,3
Step 8 — Create the Integration in Autoheal
- Go to Integrations in Autoheal
- Click Private Access → AWS SSM Tunnel
- Fill in the fields:
- AWS Integration — select the AWS integration configured with OIDC federation (e.g., "Production AWS")
- Bastion Instance ID — paste the instance ID from Step 3 (e.g., i-0dc03fbcae2a87472)
- AWS Region — the region where your bastion and EKS clusters reside (e.g., us-east-1)
- EKS Clusters — select from the dropdown; Autoheal lists all clusters your AWS integration can see in the specified region
- Click Save
The integration will show as active within ~30 seconds once the SSM tunnel is established.
How Traffic Flows
Autoheal Sandbox
│
│ kubectl get pods (hits localhost:10100)
│
▼
localhost:10100 ──── SSM port-forward ────► Bastion EC2 ────► EKS API server :443
Ports are assigned sequentially starting at 10100. For multiple clusters:
- First cluster → localhost:10100
- Second cluster → localhost:10101
The kubeconfig is written automatically with the correct context names and server addresses.
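The assignment scheme can be sketched in a few lines of shell. The cluster names below are placeholders and the loop is illustrative only — Autoheal derives the real cluster list from your AWS integration and does this automatically:

```shell
# Sketch of the per-cluster port and context assignment (illustrative only)
CLUSTERS=("my-cluster" "staging-cluster")   # placeholder names
BASE_PORT=10100

for i in "${!CLUSTERS[@]}"; do
  port=$((BASE_PORT + i))                   # 10100, 10101, ...
  context="pa-${CLUSTERS[$i]}"              # pa-<cluster-name>
  echo "${context} -> https://127.0.0.1:${port}"
done
```

Running this prints `pa-my-cluster -> https://127.0.0.1:10100` and `pa-staging-cluster -> https://127.0.0.1:10101`, matching the mapping above.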
Troubleshooting
Integration shows 'starting' but never becomes active
The setup process prints tunnel ready once all SSM tunnels are established. If the integration stays in starting for more than 60 seconds:
Check SSM status:
- macOS / Linux
- Windows (PowerShell)
aws ssm describe-instance-information \
--filters "Key=InstanceIds,Values=${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'InstanceInformationList[0].{status:PingStatus,agentVersion:AgentVersion,ip:IPAddress}' \
--output table
aws ssm describe-instance-information `
--filters "Key=InstanceIds,Values=$BASTION_INSTANCE_ID" `
--region $env:AWS_REGION `
--query 'InstanceInformationList[0].{status:PingStatus,agentVersion:AgentVersion,ip:IPAddress}' `
--output table
Test that SSM can open a session to the bastion:
- macOS / Linux
- Windows (PowerShell)
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}"
aws ssm start-session `
--target $BASTION_INSTANCE_ID `
--region $env:AWS_REGION
If this fails with TargetNotConnected, the SSM agent is not registered — see "Bastion shows Offline" below. Type exit to close the session.
Test bastion → EKS reachability using a port-forward tunnel:
This opens a port-forwarding tunnel through the bastion to the EKS API server, then checks if the API server responds:
- macOS / Linux
- Windows (PowerShell)
# Get the EKS endpoint hostname
EKS_ENDPOINT=$(aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'cluster.endpoint' --output text | sed 's|https://||')
# Open a port-forward tunnel: localhost:19999 → bastion → EKS API:443
SSM_PARAMS='{"host":["'"${EKS_ENDPOINT}"'"],"portNumber":["443"],"localPortNumber":["19999"]}'
aws ssm start-session \
--target "${BASTION_INSTANCE_ID}" \
--document-name AWS-StartPortForwardingSessionToRemoteHost \
--parameters "${SSM_PARAMS}" \
--region "${AWS_REGION}" &
SSM_PID=$!
sleep 3 # wait for tunnel to establish
# Check if the EKS API server responds through the tunnel
curl -k -o /dev/null -w "HTTP status: %{http_code}\n" https://127.0.0.1:19999
kill "${SSM_PID}" # close the tunnel
# Get the EKS endpoint hostname
$EKS_ENDPOINT = (aws eks describe-cluster `
--name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'cluster.endpoint' --output text) -replace 'https://',''
# Open a port-forward tunnel: localhost:19999 → bastion → EKS API:443
$SSM_PARAMS = '{"host":["' + $EKS_ENDPOINT + '"],"portNumber":["443"],"localPortNumber":["19999"]}'
$ssmProc = Start-Process -NoNewWindow -PassThru aws -ArgumentList `
"ssm","start-session",
"--target",$BASTION_INSTANCE_ID,
"--document-name","AWS-StartPortForwardingSessionToRemoteHost",
"--parameters",$SSM_PARAMS,
"--region",$env:AWS_REGION
Start-Sleep -Seconds 3 # wait for tunnel to establish
# Check if the EKS API server responds through the tunnel
curl.exe -k -o NUL -w "HTTP status: %{http_code}\n" https://127.0.0.1:19999
$ssmProc.Kill() # close the tunnel
Expected: HTTP status: 403 — the EKS API server responds (403 = unauthenticated, which is correct). If the curl hangs or the SSM command fails, the security group rule from Step 4 is missing or incorrect.
kubectl commands fail with 'Unauthorized' or 'Forbidden'
The SSM tunnel is working, but the Autoheal IAM role is not authorized in the EKS cluster.
Check access entries (modern clusters):
- macOS / Linux
- Windows (PowerShell)
aws eks list-access-entries \
--cluster-name "${EKS_CLUSTER_NAME}" \
--region "${AWS_REGION}" \
--query 'accessEntries' \
--output table
aws eks list-access-entries `
--cluster-name $env:EKS_CLUSTER_NAME `
--region $env:AWS_REGION `
--query 'accessEntries' `
--output table
Check aws-auth ConfigMap (legacy clusters):
kubectl get configmap aws-auth -n kube-system -o yaml
Confirm arn:aws:iam::ACCOUNT:role/AutohealReadOnlyRole appears under mapRoles.
Check the RBAC binding:
kubectl get clusterrolebinding autoheal-view -o yaml
Verify the exact role ARN — OIDC-assumed roles sometimes include a session suffix in CloudTrail, but the access entry must match the base role ARN exactly:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role \
--role-name AutohealReadOnlyRole \
--query 'Role.Arn' \
--output text
aws iam get-role `
--role-name AutohealReadOnlyRole `
--query 'Role.Arn' `
--output text
EKS clusters dropdown is empty
The Autoheal role cannot list EKS clusters.
Verify the IAM policy was attached:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[?Sid==`EKSDescribe`].Action' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[?Sid==`EKSDescribe`].Action' `
--output table
Test listing clusters using the role's effective permissions:
- macOS / Linux
- Windows (PowerShell)
aws eks list-clusters --region "${AWS_REGION}" --output table
aws eks list-clusters --region $env:AWS_REGION --output table
If this works from your terminal but the dropdown is empty, verify the region field in the Autoheal integration form matches the region where your clusters are deployed.
Bastion shows Offline in SSM
The SSM agent cannot reach AWS SSM endpoints.
Check the instance profile is attached:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-instances \
--instance-ids "${BASTION_INSTANCE_ID}" \
--region "${AWS_REGION}" \
--query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' \
--output text
aws ec2 describe-instances `
--instance-ids $BASTION_INSTANCE_ID `
--region $env:AWS_REGION `
--query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' `
--output text
Expected: an ARN containing AutohealBastionProfile. If None, attach the profile:
- macOS / Linux
- Windows (PowerShell)
aws ec2 associate-iam-instance-profile \
--instance-id "${BASTION_INSTANCE_ID}" \
--iam-instance-profile Name=AutohealBastionProfile \
--region "${AWS_REGION}"
aws ec2 associate-iam-instance-profile `
--instance-id $BASTION_INSTANCE_ID `
--iam-instance-profile Name=AutohealBastionProfile `
--region $env:AWS_REGION
Check the role is attached to the profile:
- macOS / Linux
- Windows (PowerShell)
aws iam get-instance-profile \
--instance-profile-name AutohealBastionProfile \
--query 'InstanceProfile.Roles[*].RoleName' \
--output text
aws iam get-instance-profile `
--instance-profile-name AutohealBastionProfile `
--query 'InstanceProfile.Roles[*].RoleName' `
--output text
Check outbound 443 from the security group:
- macOS / Linux
- Windows (PowerShell)
aws ec2 describe-security-groups \
--group-ids "${BASTION_SG}" \
--region "${AWS_REGION}" \
--query 'SecurityGroups[0].IpPermissionsEgress'
aws ec2 describe-security-groups `
--group-ids $BASTION_SG `
--region $env:AWS_REGION `
--query 'SecurityGroups[0].IpPermissionsEgress'
There should be a rule for TCP port 443 to 0.0.0.0/0. If missing, add it:
- macOS / Linux
- Windows (PowerShell)
aws ec2 authorize-security-group-egress \
--group-id "${BASTION_SG}" \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0 \
--region "${AWS_REGION}"
aws ec2 authorize-security-group-egress `
--group-id $BASTION_SG `
--protocol tcp `
--port 443 `
--cidr 0.0.0.0/0 `
--region $env:AWS_REGION
The bastion needs outbound 443 to reach AWS SSM service endpoints. In a private subnet with no NAT Gateway, the SSM agent cannot reach these endpoints. Options:
- Add a NAT Gateway to the subnet's route table
- Add SSM VPC endpoints (ssm, ssmmessages, ec2messages) to the VPC
'AccessDeniedException' when starting SSM session
The Autoheal role's ssm:StartSession permission is missing or scoped incorrectly.
Verify both required resources are in the policy:
- macOS / Linux
- Windows (PowerShell)
aws iam get-role-policy \
--role-name AutohealReadOnlyRole \
--policy-name AutohealPrivateAccessPolicy \
--query 'PolicyDocument.Statement[?Sid==`SSMTunnel`].Resource' \
--output table
aws iam get-role-policy `
--role-name AutohealReadOnlyRole `
--policy-name AutohealPrivateAccessPolicy `
--query 'PolicyDocument.Statement[?Sid==`SSMTunnel`].Resource' `
--output table
The output must include both:
- arn:aws:ec2:REGION:ACCOUNT:instance/INSTANCE_ID
- arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost
ssm:StartSession on the instance ARN alone is not sufficient for port-forwarding — AWS requires explicit permission on the SSM document as well.
If the instance ID in the policy doesn't match the bastion, re-run the Step 6 command with the correct BASTION_INSTANCE_ID.
Security Notes
- No inbound firewall rules needed: The bastion has zero inbound rules. All connectivity is outbound-only, initiated by the SSM agent.
- No SSH, no public IP: The bastion does not expose SSH. It has no public IP address.
- Temporary credentials only: The Autoheal OIDC role issues short-lived credentials (typically 1 hour) via STS. No long-lived keys are stored anywhere.
- Least privilege: The ssm:StartSession resource is scoped to the specific bastion instance ARN. The EKS access entry uses AmazonEKSViewPolicy (read-only), not system:masters.
- Audit trail: Every SSM session and kubectl API call appears in AWS CloudTrail under the Autoheal role identity.