Tusharjain93

Reputation: 131

Databricks API 2.0 - Cluster get response - TEMPORARILY_UNAVAILABLE

I have a Spark cluster on Azure Databricks, and I am using the C# APIs to start the cluster and get its status. This worked fine for months until Oct 24. Since then, I have been getting error messages in the following format:

Cluster Get Response : {"error_code":"TEMPORARILY_UNAVAILABLE","message":"No webapps are available to handle your request. Please try again later."}

My Cluster is in the East US region.

This error occurs intermittently when I try to get the cluster state or start the cluster. I am attaching a sample of the errors I have received in the last few days.

[Image: error message on ClustersListAsync]

[Image: error message on Cluster Get Response]

Can anyone please advise how to resolve this issue?

Upvotes: 0

Views: 3336

Answers (1)

CHEEKATLAPRADEEP

Reputation: 12788

This issue was caused by an Azure outage.

Summary of impact: Between approximately 11:00 and 14:40 UTC on 25 Oct 2019, a subset of customers using Azure Databricks may have received 'No Web App available' error notifications when logging into a Databricks workspace. Related API calls may have also not returned a response. Additionally, a very limited subset of customers using Data Factory v2 may have received failure notifications for Data Flow jobs.

Preliminary root cause: Engineers determined that a backend database used to process workspace access requests became unhealthy, causing requests to fail. As this database supports the control plane for the East US, East US 2, Central US, and North Central US regions, only customers in these regions would have seen impact. Additionally, a small number of Data Factory v2 customers in these regions would have seen downstream impact from this issue.

Mitigation: Engineers redeployed the affected backend database to mitigate the issue.

Next steps: Engineers will continue to investigate to establish the full root cause and prevent future occurrences. Stay informed about Azure service issues by creating custom service health alerts: https://aka.ms/ash-videos for video tutorials and https://aka.ms/ash-alerts for how-to documentation.

For more details, refer to "Azure Service Status History".

Update: Outage on Oct 31

Summary of Impact: Between 00:00 and 00:45 UTC on 31 Oct 2019, engineers entered a maintenance period to mitigate a regression in the latest 3.5 upgrade that may have potentially impacted your ODBC/JDBC services. Engineers performed a hotfix during the maintenance period that took approximately 10 minutes. Databricks cluster creation may have been briefly impacted while the hotfix was applied, as well as related API calls (create, update, delete, auto-scale). Access to the user interface may have also been briefly impacted. Running jobs or previously created clusters should not have been impacted.
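Since the error message itself says "Please try again later", one way to ride out short outages like these on the client side is to retry the call with exponential backoff. Below is a minimal sketch of that idea, not an official Databricks client. It is written in Python for brevity (the question uses C#, but the same pattern carries over), and the names `call_with_retry`, `fetch`, `max_attempts`, `base_delay`, and `sleep` are hypothetical, chosen for illustration only. It assumes `fetch` is a function you supply that performs the API call and returns the parsed JSON body as a dict.

```python
import time
from typing import Callable

# Error code taken from the response body shown in the question.
RETRYABLE_ERROR = "TEMPORARILY_UNAVAILABLE"


def call_with_retry(
    fetch: Callable[[], dict],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the body no longer carries the retryable error code.

    fetch        -- hypothetical callable that performs the API request and
                    returns the parsed JSON body as a dict
    max_attempts -- total number of calls to make before giving up
    base_delay   -- seconds to wait after the first failure; doubles each time
    sleep        -- injectable so the wait can be skipped in tests
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    body: dict = {}
    for attempt in range(max_attempts):
        body = fetch()
        if body.get("error_code") != RETRYABLE_ERROR:
            # Either a success or a different, non-retryable error.
            return body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)

    # Still unavailable after all attempts; hand the last error body back.
    return body
```

In practice, `fetch` would wrap whatever call currently returns the error, for example the request behind Cluster Get Response. If the outage lasts for hours, as on 25 Oct, retries will not help, and the caller still has to handle the error body that comes back after the last attempt.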

Hope this helps.

Upvotes: 2
