questionasker

Reputation: 2697

spark-submit on YARN is imbalanced (only 1 node is working)

I am trying to run a Spark application on a YARN cluster (2 nodes), but the two nodes seem imbalanced: only one node is working while the other is not.

My script:

spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster --deploy-mode cluster --num-executors 2 \
--driver-memory 1G \
--executor-memory 1G \
--executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 1000

I can see that one of my nodes is working but the other is not, so the load is imbalanced:

(Screenshot: the namenode is on the left and the datanode is on the right.)

Any idea?

Upvotes: 6

Views: 2135

Answers (3)

Harshit

Reputation: 1293

The complete dataset could be local to one of the nodes, so Spark might be trying to honour data locality. You can try the following config when launching spark-submit:

--conf "spark.locality.wait.node=0"

The same worked for me.
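
For example, the flag can be added to the submit command from the question (a sketch; the class, jar and resource sizes are just the ones shown above):

spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --deploy-mode cluster \
  --num-executors 2 --driver-memory 1G \
  --executor-memory 1G --executor-cores 2 \
  --conf "spark.locality.wait.node=0" \
  spark-examples-1.6.1-hadoop2.6.0.jar 1000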

Upvotes: 1

Bhavesh

Reputation: 919

You can check on which nodes the executors were launched from the Spark UI.

The Spark UI gives the details of the nodes where the executors are running.

Executors is a tab in the Spark UI.
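
If you prefer the command line over the UI, a quick check (a sketch, assuming shell access to a cluster host with the YARN CLI available) is to list the NodeManagers while the job is running:

# Lists each NodeManager with its state and the number of containers it is currently running;
# if one node stays at 0 containers during the job, the imbalance is confirmed.
yarn node -list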

Upvotes: 0

banjara

Reputation: 3890

You are running the job in yarn-cluster mode; in cluster mode the Spark driver runs in the ApplicationMaster on a cluster host.

Try running it in yarn-client mode; in client mode the Spark driver runs on the host where the job is submitted, so you will be able to see the output on the console:

spark-submit --verbose --class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
--num-executors 2 \
--driver-memory 1G \
--executor-memory 1G \
--executor-cores 2 spark-examples-1.6.1-hadoop2.6.0.jar 10

Upvotes: 0
