Reputation: 2470
I want to get the cluster link (or the cluster ID to manually compose the link) inside a running Spark job.
This will be used to print the link in an alerting message, making it easier for engineers to reach the logs.
Is it possible to achieve that in a Spark job running in Databricks?
Upvotes: 10
Views: 11425
Reputation: 2783
You can also try to find it as shown in this link: How to check Spark configuration from command line?
You can run the command:
sc._conf.getAll()
In the list it returns, search for "clusterId".
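A minimal PySpark sketch of that lookup, assuming it runs on Databricks where sc is the preconfigured SparkContext:

# Scan all Spark configuration entries for one whose key contains "clusterId".
cluster_id = None
for key, value in sc._conf.getAll():
    if "clusterId" in key:
        cluster_id = value
        break
print(cluster_id)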
Upvotes: 0
Reputation: 87174
When a Databricks cluster starts, a number of Spark configuration properties are added. Most of them have names starting with spark.databricks. - you can find all of them in the Environment
tab of the Spark UI.
The cluster ID is available as the spark.databricks.clusterUsageTags.clusterId
property, and you can get it as:
spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
You can get the workspace host name via the dbutils.notebook.getContext().apiUrl.get
call (for Scala), or dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
(for Python).
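Putting the two together in Python, a sketch that composes a clickable cluster link for an alert message. The #setting/clusters/<id>/configuration path is an assumption based on the usual Databricks UI URL layout and may differ between workspace versions:

# Assumes a Databricks notebook/job, where `spark` and `dbutils` are predefined.
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
workspace_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()

# Hypothetical link layout; adjust the path to match your workspace's UI.
cluster_link = f"{workspace_url}/#setting/clusters/{cluster_id}/configuration"
print(cluster_link)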
Upvotes: 16