Reputation: 2697
I am trying to run Spark apps on a YARN cluster (2 nodes), but the load seems imbalanced: only one node is working while the other is not.
My script:
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --deploy-mode cluster --num-executors 2 \
  --driver-memory 1G \
  --executor-memory 1G \
  --executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 1000
I can see that one of my nodes is working but the other is not, so the load is imbalanced:
Note: the namenode is on the left and the datanode is on the right...
Any ideas?
Upvotes: 6
Views: 2135
Reputation: 1293
The complete dataset could be local to one of the nodes, so Spark might be trying to honour data locality. You can try the following configuration when launching spark-submit:
--conf "spark.locality.wait.node=0"
The same worked for me.
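For example, applied to the command from the question (a sketch assuming the same jar, class, and resource settings; --master yarn --deploy-mode cluster is the non-deprecated form of yarn-cluster):
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  --num-executors 2 \
  --driver-memory 1G \
  --executor-memory 1G \
  --executor-cores 2 \
  --conf "spark.locality.wait.node=0" \
  spark-examples-1.6.1-hadoop2.6.0.jar 1000
Setting the node-level locality wait to 0 tells the scheduler not to hold tasks back waiting for a node-local slot, so free executors on the other node pick up work immediately.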
Upvotes: 1
Reputation: 919
You can check on which nodes the executors are launched from the Spark UI.
The Spark UI gives the details of the nodes where the executors are launched.
Executors is the tab to look at in the Spark UI.
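Since the job runs in yarn-cluster mode, the Spark UI is reached through the application's tracking URL in the YARN ResourceManager. A minimal sketch, assuming the yarn CLI is available on the submitting host:
# List running YARN applications; the output includes each application's tracking URL
yarn application -list -appStates RUNNING
# Open the tracking URL in a browser and switch to the Executors tab;
# it shows every executor together with the host it is running on.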
Upvotes: 0
Reputation: 3890
You are running the job in yarn-cluster mode; in cluster mode the Spark driver runs inside the ApplicationMaster on a cluster host.
Try running it in yarn-client mode; in client mode the Spark driver runs on the host where the job is submitted, so you will be able to see the output on the console:
spark-submit --verbose --class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
--num-executors 2 \
--driver-memory 1G \
--executor-memory 1G \
--executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 10
Upvotes: 0