Reputation: 1653
In spark-submit, what is the difference between the "yarn", "yarn-cluster", and "yarn-client" deploy modes?
# --master can also be yarn-client for client mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
https://spark.apache.org/docs/1.1.0/submitting-applications.html
Upvotes: 5
Views: 15748
Reputation: 5202
For Spark on YARN, you can specify either yarn-client or yarn-cluster. yarn-client runs the driver program in the same JVM as spark-submit, while yarn-cluster runs the Spark driver in a container on one of the NodeManagers.
From the documentation (https://spark.apache.org/docs/1.1.0/running-on-yarn.html):
"There are two deploy modes that can be used to launch Spark applications on YARN. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN."
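To make the two modes concrete, here is a minimal sketch that only prints the spark-submit command line for each mode and does not run anything. The helper name submit_cmd is hypothetical and not part of Spark; the class, jar path, and argument are copied from the question. As far as I know, Spark 2.0 and later deprecate the yarn-client and yarn-cluster master values in favor of --master yarn combined with --deploy-mode client or --deploy-mode cluster, but that is outside the 1.1.0 docs linked here, so check the documentation for your version.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): prints, but does not execute,
# a spark-submit command using the Spark 1.x master values from this thread.
submit_cmd() {
  local mode="$1"   # expected: "client" or "cluster"
  case "$mode" in
    client|cluster) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # client  -> driver runs in the local spark-submit JVM
  # cluster -> driver runs inside the YARN application master
  echo "./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-$mode /path/to/examples.jar 1000"
}

submit_cmd client
submit_cmd cluster
```

In practice, client mode suits interactive work where you want driver output in your terminal, and cluster mode suits jobs that should keep running after the submitting machine disconnects.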
Upvotes: 13