Databricks Integration
Connect Databricks to enable the AI agent to query data in SQL Warehouses, monitor ETL job runs, and check cluster health during investigations.
Capabilities
Once connected, the AI agent can:
| Capability | Description |
|---|---|
| SQL Queries | Execute read-only SQL queries against SQL Warehouses |
| Job Monitoring | Monitor ETL and data pipeline job runs |
| Cluster Health | Check Databricks cluster status and health |
| Data Exploration | Browse schemas, tables, and data catalogs |
Note: Read-only access. This integration only executes SELECT queries. UPDATE, DELETE, INSERT, and other data-modifying commands are blocked at the query level.
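To make the read-only rule concrete, here is a minimal sketch of a query-level guard. It is an illustration only and not Autoheal's actual implementation; the function name `is_read_only` and the keyword list are assumptions. The check is deliberately conservative, so a blocked keyword inside a string literal is also rejected.

```python
import re

# Hypothetical keyword list; the real integration may block more or fewer commands.
_BLOCKED = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True if `sql` looks like a single SELECT statement."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # More than one statement in the same string: reject.
        return False
    if not re.match(r"SELECT\b", statement, re.IGNORECASE):
        return False
    # Reject if any data-modifying keyword appears as a whole word.
    return _BLOCKED.search(statement) is None
```

With this sketch, `is_read_only("SELECT * FROM orders")` is accepted, while `is_read_only("DELETE FROM orders")` and `is_read_only("SELECT 1; DROP TABLE orders")` are rejected.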
Prerequisites
- A Databricks workspace with API access
- A Personal Access Token (PAT) or OAuth token
- At least read permissions on the data you want to query
Setup
1
Get Your Access Token
- Log in to your Databricks workspace
- Click your profile icon in the top-right corner
- Go to User Settings → Developer → Access tokens
- Click Generate new token
- Give it a description (e.g., "Autoheal Integration") and set an expiration
- Copy the generated token
2
Add Integration in Autoheal
- Go to Integrations in Autoheal
- Click Databricks
- Enter a name (e.g., "Production Databricks")
3
Configure Credentials
Enter the following:
- Workspace URL: Your Databricks workspace URL (e.g., `https://adb-1234567890.7.azuredatabricks.net` or `https://your-workspace.cloud.databricks.com`)
- Access Token: Your Personal Access Token (starts with `dapi...`)
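Before pasting the values in, it can help to sanity-check them locally. The sketch below normalizes a workspace URL and checks the `dapi` prefix described above. The helper names are hypothetical, and the prefix check applies only to Personal Access Tokens, not OAuth tokens.

```python
from urllib.parse import urlparse


def normalize_workspace_url(raw: str) -> str:
    """Return https://<host>, dropping any path, port, or trailing slash."""
    candidate = raw.strip()
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"not a valid HTTPS workspace URL: {raw!r}")
    return f"https://{parsed.hostname}"


def looks_like_pat(token: str) -> bool:
    """Check the 'dapi' prefix that Personal Access Tokens start with."""
    return token.strip().startswith("dapi")
```

For example, `normalize_workspace_url("your-workspace.cloud.databricks.com/sql")` returns `https://your-workspace.cloud.databricks.com`.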
4
Test and Save
Click Test Connection to verify, then Save.
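If you want to check the credentials outside Autoheal, a rough equivalent is to call the Databricks REST API directly. The sketch below lists SQL Warehouses via `GET /api/2.0/sql/warehouses` using only the Python standard library. It is not necessarily what Test Connection does internally, and the function names are assumptions.

```python
import json
import urllib.error
import urllib.request


def build_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the SQL Warehouses list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/sql/warehouses"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def check_connection(workspace_url: str, token: str, timeout: float = 10.0) -> bool:
    """Return True if the workspace answers the request with valid JSON."""
    try:
        with urllib.request.urlopen(build_request(workspace_url, token), timeout=timeout) as resp:
            json.load(resp)
            return True
    except urllib.error.HTTPError as err:
        # 401 usually points to a bad or expired token, 403 to missing permissions.
        print(f"HTTP {err.code}: {err.reason}")
    except (urllib.error.URLError, TimeoutError) as err:
        # DNS failures, blocked ports, and timeouts end up here.
        print(f"Could not reach workspace: {err}")
    return False
```

Call `check_connection(workspace_url, token)` with your own values. A printed `HTTP 401` or a connection error maps onto the Troubleshooting entries later on this page.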
Required Permissions
A token inherits the permissions of the user or service principal that created it. That identity needs at least these permissions:
| Permission | Why It's Needed |
|---|---|
| `CAN_USE` on SQL Warehouse | Execute SQL queries |
| `CAN_VIEW` on Jobs | Monitor job runs |
| `CAN_ATTACH_TO` on Clusters | Check cluster status |
| `SELECT` on Tables/Views | Read data from tables |
Tip: Create a dedicated service principal or machine user for Autoheal with only read permissions. Avoid using personal tokens with admin access.
Example Queries
Once connected, you can ask the AI agent questions like:
Query the orders table for failed transactions in the last hour
Show me the status of the nightly ETL job
Which Databricks clusters are currently running?
List the top 10 error messages from the application_logs table
Troubleshooting
401 Unauthorized Error
- Verify the access token is correct and has not expired
- Check that the token has the required permissions
- Ensure the workspace URL matches where the token was created
No SQL Warehouse Available
- Verify at least one SQL Warehouse is running in your workspace
- Check that the token has `CAN_USE` permission on the warehouse
- Ensure the warehouse is not paused or stopped
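The checks above can be sketched against the JSON returned by the Databricks `GET /api/2.0/sql/warehouses` endpoint, which reports a `state` such as `RUNNING` or `STOPPED` for each warehouse. The payload below is made-up sample data, and `running_warehouses` is a hypothetical helper name.

```python
def running_warehouses(payload: dict) -> list:
    """Return the names of warehouses whose state is RUNNING."""
    return [
        wh.get("name", wh.get("id", "<unnamed>"))
        for wh in payload.get("warehouses", [])
        if wh.get("state") == "RUNNING"
    ]


# Made-up sample response for illustration; real responses carry more fields.
sample = {
    "warehouses": [
        {"id": "abc123", "name": "analytics-wh", "state": "RUNNING"},
        {"id": "def456", "name": "adhoc-wh", "state": "STOPPED"},
    ]
}

print(running_warehouses(sample))  # prints ['analytics-wh']
```

An empty list means that no warehouse visible to the token is running. A warehouse the token lacks access to may not appear in the response at all.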
Connection Timeout
- Check that the workspace URL is correct and accessible
- Verify network connectivity to Databricks APIs
- Ensure there are no firewall rules blocking access