AWS Integration
Connect your AWS account to enable the AI agent to query CloudWatch logs and metrics, inspect EC2 instances, review ECS/EKS workloads, and monitor ElastiCache clusters during investigations.
Capabilities
Once connected, the AI agent can:
| Capability | Description |
|---|---|
| CloudWatch Logs | Search and retrieve log events across log groups |
| CloudWatch Metrics | Query metric data points and view alarm states |
| EC2 Instances | List instances, view details and status checks |
| ECS Services | Inspect clusters, services, tasks, and deployments |
| EKS Clusters | Review cluster configuration and node groups |
| ElastiCache | Inspect Redis/Memcached cluster infrastructure: topology, replication groups, events, and CloudWatch metrics. Infrastructure-level only; does not read or write cached data (keys, values). |
Authentication Methods
Autoheal supports two ways to connect to AWS depending on your deployment:
| Method | Best For | Credentials |
|---|---|---|
| OIDC Federation | SaaS and BYOC | Temporary via STS, no keys stored |
| Instance Role | BYOC only (Compose on EC2) | Inherited from EC2 instance role |
OIDC Federation
Autoheal exchanges a short-lived OIDC token for temporary AWS credentials via STS at runtime. You create an IAM OIDC Identity Provider and an IAM Role. No long-lived credentials are stored. Available on SaaS and BYOC deployments.
Prerequisites
- An AWS account with IAM administrative access
- Permission to create IAM OIDC Identity Providers and IAM Roles
- Your 12-digit AWS Account ID
- Your Autoheal tenant slug (used as part of the OIDC audience value)
Your tenant slug is a short identifier for your organization in Autoheal (e.g., acme-corp). You can find it in Settings → Organization in Autoheal. The OIDC audience value used throughout this guide follows the pattern {tenant-slug}-oidc-service, for example acme-corp-oidc-service.
Setup
AWS CLI (Recommended)
Set up everything from your terminal in under 2 minutes.
Replace 123456789012 with your actual 12-digit AWS Account ID and your-tenant-slug with your Autoheal tenant slug:
export AWS_ACCOUNT_ID="123456789012"
export AUTOHEAL_TENANT_SLUG="your-tenant-slug"
You can find your Account ID in the AWS Console (top-right dropdown) or by running aws sts get-caller-identity --query Account --output text. Your tenant slug is available in Autoheal under Settings → Organization (e.g., acme-corp).
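Before running the remaining commands, it can help to sanity-check the two values. The sketch below is one way to do that; the function names `is_valid_account_id` and `autoheal_audience` are hypothetical helpers for illustration, not part of Autoheal or the AWS CLI.

```shell
# Hypothetical helpers (names are illustrative, not an Autoheal or AWS API).

# Succeeds only if the argument is exactly 12 decimal digits.
is_valid_account_id() {
  case "$1" in
    ""|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  [ "${#1}" -eq 12 ]
}

# Prints the OIDC audience for a tenant slug, following the
# {tenant-slug}-oidc-service pattern described above.
autoheal_audience() {
  printf '%s-oidc-service\n' "$1"
}
```

For example, `is_valid_account_id "$AWS_ACCOUNT_ID" || echo "AWS_ACCOUNT_ID must be 12 digits" >&2` flags a mistyped ID, and `autoheal_audience "$AUTOHEAL_TENANT_SLUG"` prints the audience string used in the later steps.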
This tells AWS to trust tokens issued by Autoheal:
aws iam create-open-id-connect-provider \
--url "https://app.autoheal.ai" \
--client-id-list "${AUTOHEAL_TENANT_SLUG}-oidc-service" \
--thumbprint-list "0000000000000000000000000000000000000000"
The audience value is your tenant-specific OIDC identifier, following the pattern {tenant-slug}-oidc-service (e.g., acme-corp-oidc-service). The --thumbprint-list parameter is required by the API but AWS no longer uses it for validation. Any valid 40-character hex string works.
Create an IAM policy for the role. You will attach it after the role exists. The example below is a good starting point; grant more or fewer permissions depending on what you'd like Autoheal to access:
Example IAM policy (save it as autoheal-policy.json, the file the create-policy command reads):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CloudWatchLogs",
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents",
"logs:StartQuery",
"logs:GetQueryResults"
],
"Resource": "*"
},
{
"Sid": "CloudWatchMetrics",
"Effect": "Allow",
"Action": [
"cloudwatch:ListMetrics",
"cloudwatch:GetMetricData",
"cloudwatch:DescribeAlarms"
],
"Resource": "*"
},
{
"Sid": "EC2ReadOnly",
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs"
],
"Resource": "*"
},
{
"Sid": "ECSReadOnly",
"Effect": "Allow",
"Action": [
"ecs:ListClusters",
"ecs:DescribeClusters",
"ecs:ListServices",
"ecs:DescribeServices",
"ecs:ListTasks",
"ecs:DescribeTasks"
],
"Resource": "*"
},
{
"Sid": "EKSReadOnly",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster",
"eks:ListNodegroups",
"eks:DescribeNodegroup"
],
"Resource": "*"
},
{
"Sid": "ElastiCacheReadOnly",
"Effect": "Allow",
"Action": [
"elasticache:DescribeCacheClusters",
"elasticache:DescribeReplicationGroups",
"elasticache:DescribeServerlessCaches",
"elasticache:DescribeEvents"
],
"Resource": "*"
},
{
"Sid": "STSIdentity",
"Effect": "Allow",
"Action": [
"sts:GetCallerIdentity"
],
"Resource": "*"
}
]
}
aws iam create-policy \
--policy-name AutohealReadOnlyPolicy \
--policy-document file://autoheal-policy.json
You can also attach the AWS managed policy arn:aws:iam::aws:policy/ReadOnlyAccess if you prefer broad read-only access without creating a custom policy. You can further scope Resource to specific ARNs (e.g., particular log groups or clusters) if you want tighter control.
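As an illustration of scoping Resource, the sketch below prints a statement that limits a few CloudWatch Logs read actions to a single log group. The function name `scoped_logs_statement`, the Sid, and the choice of actions are assumptions for this example. Some actions in the full policy may not support resource-level scoping and may still need `"Resource": "*"`, so verify against the IAM documentation for each service before relying on it.

```shell
# Hypothetical helper: prints one scoped policy statement as JSON.
# Arguments: region, 12-digit account ID, log group name.
scoped_logs_statement() {
  cat <<EOF
{
  "Sid": "CloudWatchLogsScoped",
  "Effect": "Allow",
  "Action": [
    "logs:DescribeLogStreams",
    "logs:FilterLogEvents",
    "logs:GetLogEvents"
  ],
  "Resource": "arn:aws:logs:$1:$2:log-group:$3:*"
}
EOF
}
```

Running `scoped_logs_statement us-east-1 "$AWS_ACCOUNT_ID" payment-service` prints a statement you could use in place of the broader CloudWatchLogs statement in autoheal-policy.json.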
Create the role that Autoheal will assume via OIDC federation:
aws iam create-role \
--role-name AutohealReadOnlyRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::'$AWS_ACCOUNT_ID':oidc-provider/app.autoheal.ai"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"app.autoheal.ai:aud": "'$AUTOHEAL_TENANT_SLUG'-oidc-service"
}
}
}
]
}'
Attach the policy to the role:
aws iam attach-role-policy \
--role-name AutohealReadOnlyRole \
--policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/AutohealReadOnlyPolicy"
Retrieve the role ARN:
aws iam get-role \
--role-name AutohealReadOnlyRole \
--query 'Role.Arn' --output text
Copy the output. You'll paste it into Autoheal in the next step.
- Go to Integrations in Autoheal
- Click Amazon Web Services
- Enter a name (e.g., "Production AWS")
- Fill in:
- AWS Account ID: Your 12-digit account ID
- IAM Role ARN: The role ARN from the previous step
- AWS Region (optional): The default region for tool calls (e.g., us-east-1). Defaults to us-east-1 if left blank.
- Click Test Connection to verify, then Save
AWS Console (UI)
This tells AWS to trust tokens issued by Autoheal.
- Open the IAM Console → Identity providers → Add provider
- Select OpenID Connect
- Enter the following:
- Provider URL: https://app.autoheal.ai
- Audience: {your-tenant-slug}-oidc-service (e.g., acme-corp-oidc-service)
- Click Get thumbprint, then Add provider
The audience value is your tenant-specific OIDC identifier. Replace {your-tenant-slug} with your Autoheal tenant slug, which you can find in Settings → Organization. This ensures only your Autoheal tenant can assume the role.
- Go to IAM → Roles → Create role
- Select Web identity as the trusted entity type
- Choose the identity provider you just created (app.autoheal.ai)
- Select audience: {your-tenant-slug}-oidc-service
- Click Next and attach the permissions policy (see next step)
- Name the role (e.g., AutohealReadOnlyRole) and create it
After creation, copy the Role ARN (e.g., arn:aws:iam::123456789012:role/AutohealReadOnlyRole).
Attach an IAM policy to the role. You can use the AWS managed policy ReadOnlyAccess for broad access, or create a custom policy scoped to the services Autoheal uses.
Example custom policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CloudWatchLogs",
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents",
"logs:StartQuery",
"logs:GetQueryResults"
],
"Resource": "*"
},
{
"Sid": "CloudWatchMetrics",
"Effect": "Allow",
"Action": [
"cloudwatch:ListMetrics",
"cloudwatch:GetMetricData",
"cloudwatch:DescribeAlarms"
],
"Resource": "*"
},
{
"Sid": "EC2ReadOnly",
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs"
],
"Resource": "*"
},
{
"Sid": "ECSReadOnly",
"Effect": "Allow",
"Action": [
"ecs:ListClusters",
"ecs:DescribeClusters",
"ecs:ListServices",
"ecs:DescribeServices",
"ecs:ListTasks",
"ecs:DescribeTasks"
],
"Resource": "*"
},
{
"Sid": "EKSReadOnly",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster",
"eks:ListNodegroups",
"eks:DescribeNodegroup"
],
"Resource": "*"
},
{
"Sid": "ElastiCacheReadOnly",
"Effect": "Allow",
"Action": [
"elasticache:DescribeCacheClusters",
"elasticache:DescribeReplicationGroups",
"elasticache:DescribeServerlessCaches",
"elasticache:DescribeEvents"
],
"Resource": "*"
},
{
"Sid": "STSIdentity",
"Effect": "Allow",
"Action": [
"sts:GetCallerIdentity"
],
"Resource": "*"
}
]
}
You can scope Resource to specific ARNs (e.g., particular log groups or clusters) if you want finer-grained control.
- Go to Integrations in Autoheal
- Click Amazon Web Services
- Enter a name (e.g., "Production AWS")
- Fill in:
- AWS Account ID: Your 12-digit account ID (e.g., 123456789012)
- IAM Role ARN: The role ARN from the previous step
- AWS Region (optional): The default region for tool calls. Defaults to us-east-1 if left blank.
- Click Test Connection to verify, then Save
Trust Policy Reference
The IAM Role's trust policy should look like this (created automatically by the IAM wizard):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/app.autoheal.ai"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"app.autoheal.ai:aud": "YOUR_TENANT_SLUG-oidc-service"
}
}
}
]
}
Replace YOUR_ACCOUNT_ID with your 12-digit AWS account ID and YOUR_TENANT_SLUG with your Autoheal tenant slug (e.g., acme-corp, which gives the audience acme-corp-oidc-service).
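Filling in the two placeholders can be sketched with a small shell function that renders the same trust policy from an account ID and a tenant slug. The name `render_trust_policy` and the output file name in the usage note are hypothetical.

```shell
# Hypothetical helper: prints the trust policy with placeholders filled in.
# Arguments: 12-digit AWS account ID, Autoheal tenant slug.
render_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$1:oidc-provider/app.autoheal.ai"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "app.autoheal.ai:aud": "$2-oidc-service"
        }
      }
    }
  ]
}
EOF
}
```

One possible use is `render_trust_policy 123456789012 acme-corp > trust-policy.json`, after which the file can be compared with the role's current trust policy or applied with `aws iam update-assume-role-policy --role-name AutohealReadOnlyRole --policy-document file://trust-policy.json`.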
Network Requirements
This method communicates entirely over public HTTPS endpoints. No VPC peering or private connectivity is required.
| Direction | From | To | Purpose |
|---|---|---|---|
| AWS → Autoheal | AWS STS | https://app.autoheal.ai/.well-known/openid-configuration | AWS validates the OIDC token by fetching Autoheal's OIDC discovery document and the signing keys it references |
| Autoheal → AWS | Autoheal platform | sts.amazonaws.com (or regional STS endpoints) | Exchanges OIDC token for temporary AWS credentials |
| Autoheal → AWS | Autoheal platform | Regional AWS service endpoints | Executes read-only API calls using temporary credentials |
If your AWS account uses Service Control Policies (SCPs) or VPC endpoint policies, ensure they do not block sts:AssumeRoleWithWebIdentity or the read-only actions granted to the Autoheal role.
Instance Role (BYOC)
Instance Role authentication is only available for self-hosted (BYOC) deployments running on EC2. If you are using Autoheal Cloud (SaaS), use OIDC Federation instead.
Autoheal inherits the IAM role attached to the EC2 instance and injects temporary credentials into agent sandboxes at runtime.
Prerequisites
- Autoheal deployed via Docker Compose on EC2
- An IAM Role attached to the EC2 instance with the required permissions
- Your 12-digit AWS Account ID
- A target IAM Role ARN for cross-account access (Optional)
Setup
AWS CLI (Recommended)
Set your 12-digit AWS Account ID for the commands below:
export AWS_ACCOUNT_ID="123456789012"
Create a read-only IAM policy scoped to the services Autoheal uses. Feel free to grant more or fewer permissions depending on what you'd like Autoheal to access:
Example IAM policy (save it as autoheal-policy.json, the file the create-policy command reads):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CloudWatchLogs",
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents",
"logs:StartQuery",
"logs:GetQueryResults"
],
"Resource": "*"
},
{
"Sid": "CloudWatchMetrics",
"Effect": "Allow",
"Action": [
"cloudwatch:ListMetrics",
"cloudwatch:GetMetricData",
"cloudwatch:DescribeAlarms"
],
"Resource": "*"
},
{
"Sid": "EC2ReadOnly",
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs"
],
"Resource": "*"
},
{
"Sid": "ECSReadOnly",
"Effect": "Allow",
"Action": [
"ecs:ListClusters",
"ecs:DescribeClusters",
"ecs:ListServices",
"ecs:DescribeServices",
"ecs:ListTasks",
"ecs:DescribeTasks"
],
"Resource": "*"
},
{
"Sid": "EKSReadOnly",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster",
"eks:ListNodegroups",
"eks:DescribeNodegroup"
],
"Resource": "*"
},
{
"Sid": "ElastiCacheReadOnly",
"Effect": "Allow",
"Action": [
"elasticache:DescribeCacheClusters",
"elasticache:DescribeReplicationGroups",
"elasticache:DescribeServerlessCaches",
"elasticache:DescribeEvents"
],
"Resource": "*"
},
{
"Sid": "STSIdentity",
"Effect": "Allow",
"Action": [
"sts:GetCallerIdentity"
],
"Resource": "*"
}
]
}
aws iam create-policy \
--policy-name AutohealReadOnlyPolicy \
--policy-document file://autoheal-policy.json
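You can also attach the AWS managed policy arn:aws:iam::aws:policy/ReadOnlyAccess for broad read-only access.
Attach the policy to the IAM role of your EC2 instance, replacing YOUR_EC2_INSTANCE_ROLE_NAME with that role's name: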
aws iam attach-role-policy \
--role-name YOUR_EC2_INSTANCE_ROLE_NAME \
--policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/AutohealReadOnlyPolicy"
Verify the role is attached to the instance:
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/
This should return the role name. If it returns a 404, no role is attached. A 401 means the instance requires IMDSv2, which needs a session token on every request.
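On instances that enforce IMDSv2, the same check needs a session token first. A minimal sketch, assuming it runs on the EC2 instance itself; the function name `imds_role_name` and the 60-second token lifetime are illustrative choices.

```shell
# Hypothetical helper: prints the IAM role name attached to this instance
# using IMDSv2. Only works when run on the EC2 instance.
imds_role_name() {
  # Step 1: request a short-lived session token from IMDS.
  token=$(curl -s -f --max-time 2 -X PUT \
    "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  # Step 2: send the token with the metadata request.
  # -f makes curl exit non-zero on HTTP errors such as 404 (no role attached).
  curl -s -f --max-time 2 \
    -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
}
```

Calling `imds_role_name` prints the role name on success and returns a non-zero status if the token request or the metadata request fails.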
- Go to Integrations in Autoheal
- Click Amazon Web Services
- Enter a name (e.g., "Production AWS")
- Select Instance Role as the authentication method
- Fill in:
- AWS Account ID: Your 12-digit account ID
- AWS Region (optional): Default region for tool calls (defaults to us-east-1)
- IAM Role ARN (optional): Only for cross-account access (see below)
- Click Save
AWS Console (UI)
- Open the IAM Console → Policies → Create policy
- Switch to the JSON editor and paste the policy below. Feel free to add or remove permissions depending on what you'd like Autoheal to access.
Example custom policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CloudWatchLogs",
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents",
"logs:StartQuery",
"logs:GetQueryResults"
],
"Resource": "*"
},
{
"Sid": "CloudWatchMetrics",
"Effect": "Allow",
"Action": [
"cloudwatch:ListMetrics",
"cloudwatch:GetMetricData",
"cloudwatch:DescribeAlarms"
],
"Resource": "*"
},
{
"Sid": "EC2ReadOnly",
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs"
],
"Resource": "*"
},
{
"Sid": "ECSReadOnly",
"Effect": "Allow",
"Action": [
"ecs:ListClusters",
"ecs:DescribeClusters",
"ecs:ListServices",
"ecs:DescribeServices",
"ecs:ListTasks",
"ecs:DescribeTasks"
],
"Resource": "*"
},
{
"Sid": "EKSReadOnly",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:DescribeCluster",
"eks:ListNodegroups",
"eks:DescribeNodegroup"
],
"Resource": "*"
},
{
"Sid": "ElastiCacheReadOnly",
"Effect": "Allow",
"Action": [
"elasticache:DescribeCacheClusters",
"elasticache:DescribeReplicationGroups",
"elasticache:DescribeServerlessCaches",
"elasticache:DescribeEvents"
],
"Resource": "*"
},
{
"Sid": "STSIdentity",
"Effect": "Allow",
"Action": [
"sts:GetCallerIdentity"
],
"Resource": "*"
}
]
}
- Name the policy (e.g., AutohealReadOnlyPolicy) and create it
You can also attach the AWS managed policy ReadOnlyAccess for broad read-only access.
- Go to IAM → Roles → find the IAM Role attached to your EC2 instance
- Click Add permissions → Attach policies
- Search for and attach the policy you just created
To verify the role is attached to the instance:
- Go to EC2 → Instances → select your instance
- Check the IAM Role field in the instance details
- Go to Integrations in Autoheal
- Click Amazon Web Services
- Enter a name (e.g., "Production AWS")
- Select Instance Role as the authentication method
- Fill in:
- AWS Account ID: Your 12-digit account ID
- AWS Region (optional): Default region for tool calls (defaults to us-east-1)
- IAM Role ARN (optional): Only for cross-account access (see below)
- Click Save
Cross-Account Access
If Autoheal runs in one account (Account A) but you need to access resources in another account (Account B), provide the target account's IAM Role ARN in the integration settings.
The instance role in Account A will call sts:AssumeRole to obtain temporary credentials for Account B. This requires:
- The instance role (Account A) has sts:AssumeRole permission for the target role:
{
"Sid": "CrossAccountAssumeRole",
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::ACCOUNT_B_ID:role/AutohealCrossAccountRole"
}
- The target role (Account B) has a trust policy allowing the instance role:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::ACCOUNT_A_ID:role/YourInstanceRoleName"
},
"Action": "sts:AssumeRole"
}
]
}
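The first requirement can be sketched from the CLI by wrapping the statement in a complete policy document and adding it to the instance role as an inline policy. The function name `cross_account_policy`, the file name, and the inline policy name `AutohealCrossAccountAssume` are hypothetical.

```shell
# Hypothetical helper: prints a full policy document allowing sts:AssumeRole
# on one target role. Arguments: Account B ID, target role name.
cross_account_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CrossAccountAssumeRole",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::$1:role/$2"
    }
  ]
}
EOF
}

# Example usage (run with credentials for Account A; not executed here):
#   cross_account_policy ACCOUNT_B_ID AutohealCrossAccountRole > assume-role.json
#   aws iam put-role-policy \
#     --role-name YOUR_EC2_INSTANCE_ROLE_NAME \
#     --policy-name AutohealCrossAccountAssume \
#     --policy-document file://assume-role.json
```

An inline policy is used here only to keep the sketch short; a managed policy attached with attach-role-policy would work equally well.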
Network Requirements
| Direction | From | To | Purpose |
|---|---|---|---|
| Host → AWS | Autoheal API server | sts.amazonaws.com (or regional STS endpoints) | Only needed for cross-account AssumeRole |
| Sandbox → AWS | Agent sandbox | Regional AWS service endpoints (e.g., logs.{region}.amazonaws.com) | Executes read-only API calls using injected credentials |
The Autoheal API server fetches credentials from IMDS on the host and injects them into the agent sandbox as environment variables.
Multi-Region Access
The AWS Region field sets the default region for tool calls (CloudWatch, EC2, ECS, EKS, ElastiCache queries). It is optional. If left blank, us-east-1 is used.
To monitor resources in multiple regions, create separate integrations for each region:
- "AWS US East" → us-east-1
- "AWS EU West" → eu-west-1
All integrations can use the same IAM Role. The role is not region-specific.
Example Queries
Once connected, you can ask the AI agent:
Show me error logs from the payment-service log group in the last hour
What CloudWatch alarms are currently in ALARM state?
List all EC2 instances tagged with Environment=production
Show me the ECS services in the main cluster and their deployment status
What EKS node groups are running and what's their scaling configuration?
List all ElastiCache clusters and their status
Describe the replication group cell-a-redis, including its failover and topology
What is the CPU utilization of my ElastiCache cluster over the last hour?
Troubleshooting
OIDC: Test Connection fails with 'AccessDenied'
- Verify the IAM Role ARN is correct
- Check the trust policy has app.autoheal.ai as the federated principal
- Ensure the audience condition matches your tenant-specific value ({your-tenant-slug}-oidc-service)
OIDC: 'InvalidIdentityToken' error
- The OIDC token may have expired. Retry the connection
- Verify the IAM OIDC Identity Provider URL is exactly https://app.autoheal.ai (no trailing slash)
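One way to check the provider from the CLI is to compare the ARNs registered in the account against the ARN this guide uses for the Autoheal provider. This is a rough sketch; the function names `expected_provider_arn` and `provider_registered` are hypothetical, and it only confirms that a matching provider ARN exists, not that every provider setting is correct.

```shell
# Hypothetical helpers for checking the OIDC provider registration.

# Prints the provider ARN used throughout this guide for a given account ID.
expected_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/app.autoheal.ai\n' "$1"
}

# Succeeds if the whitespace-separated ARN list in $2 contains the
# expected provider ARN for the account ID in $1.
provider_registered() {
  expected=$(expected_provider_arn "$1")
  for arn in $2; do
    [ "$arn" = "$expected" ] && return 0
  done
  return 1
}

# Example usage (requires AWS credentials; not executed here):
#   arns=$(aws iam list-open-id-connect-providers \
#     --query 'OpenIDConnectProviderList[].Arn' --output text)
#   provider_registered "$AWS_ACCOUNT_ID" "$arns" || echo "provider not found" >&2
```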
Instance Role: cross-account AssumeRole fails
- Verify the instance role has sts:AssumeRole permission for the target role ARN
- Check the target account's role trust policy allows the instance role as a principal
- Ensure the target role ARN is correctly entered in the integration settings
Can query some services but not others
- Review the IAM policy attached to the role
- Ensure all required permissions are granted
- Check for Service Control Policies (SCPs) that might restrict access
No data returned for a region
- Verify you selected the correct region when creating the integration
- Check that resources exist in that region
- For CloudWatch Logs, ensure log groups are in the selected region
How to revoke Autoheal access
OIDC Federation: Delete the IAM Role and OIDC provider.
export AWS_ACCOUNT_ID="123456789012"
aws iam detach-role-policy \
--role-name AutohealReadOnlyRole \
--policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/AutohealReadOnlyPolicy"
aws iam delete-role --role-name AutohealReadOnlyRole
aws iam delete-policy \
--policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/AutohealReadOnlyPolicy"
aws iam delete-open-id-connect-provider \
--open-id-connect-provider-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/app.autoheal.ai"
Instance Role: Detach AutohealReadOnlyPolicy from the IAM Role attached to the EC2 instance, or remove the role from the instance.
Next Steps
Once your AWS integration is configured, connect the AWS-powered integrations you need. Each one links to this AWS integration for authentication and has its own setup guide covering the additional permissions required:
- PostgreSQL (RDS IAM Auth): Query Amazon RDS or Aurora PostgreSQL databases using IAM authentication