arun

Reputation: 11023

How to find out the driver IP in a Databricks cluster?

Is there a way to find out what the driver IP is on a Databricks cluster? The Ganglia UI shows all the nodes on the main page, and there doesn't seem to be a way to filter for the driver only.


Upvotes: 3

Views: 16138

Answers (2)

Are we talking about the internal IP address (the one the previous answers reply to) or the external IP address (the one a third party sees when, for example, we invoke an external API from our cluster)?
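If it is the internal one, you can also read it straight from the Spark configuration. A minimal sketch, assuming the standard Databricks notebook context where sc (the SparkContext) is predefined:

# The driver's internal IP as Spark registered it at startup.
# Assumes a notebook where `sc` is already defined.
driver_ip = sc.getConf().get("spark.driver.host")
print(driver_ip)  # e.g. 10.255.128.6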

If we are talking about the second one, I have a humble notebook to illustrate it:

def get_external_ip(x):
    # Imports live inside the function so the closure stays self-contained
    # when Spark serializes it and ships it to the worker nodes.
    import requests
    import socket

    hostname = socket.gethostname()
    r = requests.get("https://api.ipify.org/")
    public_ip = r.text  # .text gives a str; .content would print as bytes
    return f"#{x} From {hostname} with public IP {public_ip}."

print('DRIVER:')
print(get_external_ip(0))  # a plain local call, so it runs on the driver

print('WORKERS:')
rdd = sc.parallelize(range(1, 4)).map(get_external_ip)  # runs on the executors
for row in rdd.collect():
    print(row)

It shows you the driver's external IP and the workers' external IPs (adjust the range to match the number of worker nodes).
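If you would rather not adjust the range by hand, here is a sketch of a variant under the same notebook assumptions: oversubscribe the tasks and de-duplicate by hostname, so each node that receives a task is reported once (sc.defaultParallelism * 2 is just a convenient upper bound, not a guarantee that every node is hit).

def host_and_ip(_):
    # Self-contained closure, same pattern as get_external_ip above.
    import requests
    import socket
    return (socket.gethostname(), requests.get("https://api.ipify.org/").text)

n = sc.defaultParallelism * 2  # more tasks than cores, to touch every node
pairs = sc.parallelize(range(n), n).map(host_and_ip).distinct().collect()
for hostname, ip in pairs:
    print(f"{hostname} -> {ip}")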

I hope it can be useful.

Upvotes: 6

Kyle Winkelman

Reputation: 461

You can go to the Spark cluster UI - Master tab within the cluster. The URL listed at the top contains the driver's IP, and the workers' IPs are listed at the bottom.


Depending on your use case, it may be helpful to know that in an init script you can get the driver IP from the DB_DRIVER_IP environment variable. https://docs.databricks.com/clusters/init-scripts.html#environment-variables

There are other environment variables set at runtime that can be accessed in a Scala notebook:

System.getenv.get("MASTER")         // spark://10.255.128.6:7077
System.getenv.get("SPARK_LOCAL_IP") // 10.255.128.6

Upvotes: 10
