Reputation: 317
The question is exactly what is specified in the title.
I want to start my driver program on 192.168.1.1, but when I submit my Spark application to YARN, YARN chooses a random machine to run the driver.
Can I choose the driver machine manually in YARN cluster mode?
The answer to the linked duplicate question won't work on YARN.
Upvotes: 1
Views: 950
Reputation: 351
Like Yaron replied before, with YARN as master you have two options:
If you select cluster mode, you let YARN decide where the driver is spawned, based on resource availability in the cluster. If you select client mode, the driver is spawned in the client process, on the machine where you ran spark-submit.
So, a solution for your problem should be to run the command
spark-submit --master yarn --deploy-mode client ...
on the machine you want the driver to be on.
Make sure that:
Upvotes: 2
Reputation: 10450
If you want to use a specific machine as the driver, you should use YARN client mode.
Spark docs - Launching Spark on YARN:
There are two deploy modes that can be used to launch Spark applications on YARN. In cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
In YARN client mode - the driver runs in the client process (you can choose the driver machine: it is the machine that executes the spark-submit command).
In YARN cluster mode - the Spark driver runs inside an application master process which is managed by YARN on the cluster.
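To make the difference between the two modes concrete, here is a minimal sketch of the two invocations. The helper name build_submit_cmd, the class com.example.MyApp and the jar app.jar are hypothetical placeholders, not taken from the question; the helper only prints the command so you can check it before running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the spark-submit command for a given deploy mode.
# com.example.MyApp and app.jar are placeholders for your own application.
build_submit_cmd() {
  mode="$1"
  case "$mode" in
    client|cluster) ;;
    *) echo "unknown deploy mode: $mode" >&2; return 1 ;;
  esac
  echo "spark-submit --master yarn --deploy-mode $mode --class com.example.MyApp app.jar"
}

# Client mode: run the printed command on the machine that should host
# the driver (192.168.1.1 in the question).
build_submit_cmd client

# Cluster mode: YARN picks the node that hosts the driver.
build_submit_cmd cluster
```

In client mode, the choice of driver machine is made simply by where you log in and run the command, so no extra option for the driver host is needed in this sketch.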
Upvotes: 0