dr11

Reputation: 5726

Failing to create Azure Databricks cluster because of unreachable instances

I'm trying to create a cluster in Azure Databricks and I'm getting the following error message:

Resources were not reachable via SSH. If the problem persists, this usually indicates a network environment misconfiguration. Please check your cloud provider configuration, and make sure that Databricks control plane can reach Spark clusters instances.

I'm using the default configuration:

Cluster mode: Standard

Pool: None

Runtime version: 5.5 LTS

Autoscaling enabled

Worker Type: Standard_DS3_v2

Driver Type: Standard_DS3_v2

From Log Analytics I can see that Azure tried to create the virtual machines and then, for no apparent reason (I suppose because they were unreachable), deleted all of them.

Has anyone faced this issue?

Upvotes: 2

Views: 10039

Answers (2)

CHEEKATLAPRADEEP

Reputation: 12768

Issue: Instances Unreachable: Resources were not reachable via SSH.

Possible cause: traffic from control plane to workers is blocked. If you are deploying to an existing virtual network connected to your on-premises network, review your setup using the information supplied in Connect your Azure Databricks Workspace to your On-Premises Network.

Reference: Azure Databricks - Troubleshooting

Hope this helps.

Upvotes: 0

Nancy Xiong

Reputation: 28204

If the issue is temporary, it may be caused by the driver virtual machine going down or by a networking issue: Azure Databricks was able to launch the cluster but then lost the connection to the instance hosting the Spark driver, as described in this reference. You could try deleting the cluster and creating it again.

If the problem persists, this can happen when your Azure Databricks workspace is deployed to your own VNet. If the virtual network where the workspace is deployed is already peered or has an ExpressRoute connection to on-premises resources, the virtual network cannot make an SSH connection to the cluster node when Azure Databricks attempts to create a cluster. You could add a user-defined route (UDR) to give the Azure Databricks control plane SSH access to the cluster instances.

For detailed UDR instructions, see Step 3: Create user-defined routes and associate them with your Azure Databricks virtual network subnets. For more VNet-related troubleshooting information, see Troubleshooting.
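To illustrate why a UDR can help, here is a minimal Python sketch of how a route table picks the next hop for a destination by longest prefix match. It is only a simplified model, not an Azure API call. The assumptions are mine: the control plane address 203.0.113.10 is a placeholder from a documentation range (use the real address for your region from the Databricks docs), the forced-tunnel route 0.0.0.0/0 via the gateway is just a typical on-premises setup, and effective_next_hop is a hypothetical helper.

```python
import ipaddress


def effective_next_hop(routes, destination_ip):
    """Return the next hop type of the most specific route that matches
    destination_ip, or None if no route matches (simplified model)."""
    dest = ipaddress.ip_address(destination_ip)
    matches = [
        r for r in routes
        if dest in ipaddress.ip_network(r["address_prefix"])
    ]
    if not matches:
        return None
    # Longest prefix match: the route with the largest prefix length wins.
    best = max(
        matches,
        key=lambda r: ipaddress.ip_network(r["address_prefix"]).prefixlen,
    )
    return best["next_hop_type"]


# Placeholder address (TEST-NET-3), NOT a real Databricks control plane IP.
CONTROL_PLANE_IP = "203.0.113.10"

# Assumed setup: all outbound traffic is forced through the on-premises gateway.
forced_tunnel_only = [
    {"address_prefix": "0.0.0.0/0", "next_hop_type": "VirtualNetworkGateway"},
]

# Same table plus a UDR that sends control plane traffic straight to the Internet.
with_udr = forced_tunnel_only + [
    {"address_prefix": "203.0.113.10/32", "next_hop_type": "Internet"},
]

print(effective_next_hop(forced_tunnel_only, CONTROL_PLANE_IP))
print(effective_next_hop(with_udr, CONTROL_PLANE_IP))
```

With only the forced-tunnel route, traffic destined for the control plane goes to the gateway. Once the more specific /32 route is added, it takes precedence and the next hop becomes Internet, which is the effect the UDR in the linked instructions is meant to achieve.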

Hope this could help you.

Upvotes: 2
