
Reputation: 3266

Setting up dynamic allocation in Apache Spark?

I am following the instructions here for setting up dynamic allocation with the YARN resource manager.

However, I am confused by step 3: Add this jar to the classpath of all NodeManagers in your cluster.

Does this mean going to each node and adding the path of shuffle.jar to the PATH environment variable, e.g. export PATH=$PATH:<loc-to-shuffle.jar>?

Upvotes: 7

Views: 1099

Answers (1)

Anupam Jain

Reputation: 476

The YARN classpath is set on every NodeManager through the yarn.application.classpath property in yarn-site.xml, which holds a comma-separated list of CLASSPATH entries.

When this value is empty, the following default CLASSPATH for YARN applications is used:

  • For Linux:
$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/share/hadoop/common/*, $HADOOP_COMMON_HOME/share/hadoop/common/lib/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*, $HADOOP_YARN_HOME/share/hadoop/yarn/*, $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
  • For Windows:
%HADOOP_CONF_DIR%, %HADOOP_COMMON_HOME%/share/hadoop/common/*, %HADOOP_COMMON_HOME%/share/hadoop/common/lib/*, %HADOOP_HDFS_HOME%/share/hadoop/hdfs/*, %HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*, %HADOOP_YARN_HOME%/share/hadoop/yarn/*, %HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*

So, on every NodeManager, put spark-<version>-yarn-shuffle.jar in one of the directories listed in yarn.application.classpath or, if that property is empty, in one of the default classpath directories.

Alternatively, you can create a symbolic link to spark-<version>-yarn-shuffle.jar in one of the YARN classpath directories.
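As a rough sketch of the symlink approach, something like the following could be run on each NodeManager host. The helper name link_shuffle_jar and the /opt/spark/yarn location in the usage comment are my own assumptions for illustration, not anything from the Spark or Hadoop docs; adjust them to wherever the jar and your Hadoop install actually live.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or Hadoop): link a jar into a
# directory that is already on the YARN classpath.
# Usage: link_shuffle_jar <path-to-jar> <classpath-dir>
link_shuffle_jar() {
    jar="$1"
    target_dir="$2"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$target_dir" ]; then
        echo "classpath directory not found: $target_dir" >&2
        return 1
    fi
    # -s creates a symbolic link; -f replaces an existing link with the same name.
    ln -sf "$jar" "$target_dir/$(basename "$jar")"
}

# Example call (the jar location is an assumed path; keep your real version in the name):
# link_shuffle_jar /opt/spark/yarn/spark-<version>-yarn-shuffle.jar \
#     "$HADOOP_YARN_HOME/share/hadoop/yarn/lib"
```

The NodeManagers would most likely need a restart afterwards to pick up the new jar.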

Hope this helps...

Upvotes: 6
