Get Databricks cluster ID (or get cluster link) in a Spark job

I want to get the cluster link (or the cluster ID to manually compose the link) inside a running Spark job.

This will be used to print the link in an alerting message, making it easier for engineers to reach the logs.

Is it possible to achieve that in a Spark job running in Databricks?

Upvotes: 10

Views: 11425

Answers (2)

Alexandre Neukirchen

Reputation: 2783

You can also try to find it as shown in this link: How to check Spark configuration from command line?

You can run the command:

sc._conf.getAll()

In the list it returns, search for "clusterId".
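For example, a minimal Python sketch (assuming sc is the active SparkContext on a Databricks cluster and that the property key contains "clusterId"):

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
# Filter the full configuration list for any key containing "clusterId"
cluster_entries = [(k, v) for k, v in sc._conf.getAll() if "clusterId" in k]
# Typically this surfaces spark.databricks.clusterUsageTags.clusterId
print(cluster_entries)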

Upvotes: 0

Alex Ott

Reputation: 87174

When a Databricks cluster starts, a number of Spark configuration properties are added. Most of them have names starting with spark.databricks. - you can find all of them in the Environment tab of the Spark UI.

The cluster ID is available as the spark.databricks.clusterUsageTags.clusterId property, and you can get it as:

spark.conf.get("spark.databricks.clusterUsageTags.clusterId") 

You can get the workspace host name via the dbutils.notebook.getContext().apiUrl.get call (in Scala), or dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get() (in Python).
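Putting the two together, here is a rough Python sketch that composes a cluster link for an alert message. It assumes it runs on a Databricks cluster where spark and dbutils are predefined, and that the /#setting/clusters/<cluster-id>/sparkUi path is valid for your workspace UI version (the exact path may differ between Databricks releases):

# Sketch only: compose a link to the running cluster for use in alerts.
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
workspace_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
# Adjust the path below if your workspace UI uses a different cluster URL layout.
cluster_link = f"{workspace_url}/#setting/clusters/{cluster_id}/sparkUi"
print(f"Cluster logs: {cluster_link}")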

Upvotes: 16
