Reputation: 2697
I am trying to run Spark apps on a YARN cluster (2 nodes), but the load seems imbalanced: only one node is working while the other is not.
My script:
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --deploy-mode cluster --num-executors 2 \
  --driver-memory 1G \
  --executor-memory 1G \
  --executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 1000
I can see that one of my nodes is working but the other is not, so the load is imbalanced:
Note: the namenode is on the left and the datanode is on the right...
Any ideas?
Upvotes: 6
Views: 2135
Reputation: 1293
The complete dataset could be local to one of the nodes, so Spark might be trying to honour data locality. You can try the following configuration when launching spark-submit:
--conf "spark.locality.wait.node=0"
The same worked for me.
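For example, applied to the command from the question (a sketch assuming the same jar, class, and resource settings; --master yarn --deploy-mode cluster is the non-deprecated form of yarn-cluster):
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  --num-executors 2 \
  --driver-memory 1G \
  --executor-memory 1G \
  --executor-cores 2 \
  --conf "spark.locality.wait.node=0" \
  spark-examples-1.6.1-hadoop2.6.0.jar 1000
Setting the node-level locality wait to 0 tells the scheduler not to hold tasks back waiting for a node-local slot, so free executors on the other node pick up work immediately.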
Upvotes: 1
Reputation: 919
You can check on which nodes the executors are launched from the Spark UI.
The Spark UI gives the details of the nodes where the executors are launched.
Executors is the tab to look at in the Spark UI.
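Since the job runs in yarn-cluster mode, the Spark UI is reached through the application's tracking URL in the YARN ResourceManager. A minimal sketch, assuming the yarn CLI is available on the submitting host:
# List running YARN applications; the output includes each application's tracking URL
yarn application -list -appStates RUNNING
# Open the tracking URL in a browser and switch to the Executors tab;
# it shows every executor together with the host it is running on.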
Upvotes: 0
Reputation: 3890
You are running the job in yarn-cluster mode; in cluster mode the Spark driver runs inside the ApplicationMaster on a cluster host.
Try running it in yarn-client mode; in client mode the Spark driver runs on the host where the job is submitted, so you will be able to see the output on the console:
spark-submit --verbose --class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
--num-executors 2 \
--driver-memory 1G \
--executor-memory 1G \
--executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 10
Upvotes: 0