Jack

Reputation: 5890

No class found when running Spark on YARN

The same code runs fine on Spark standalone, but it fails when I run Spark on YARN. The exception, thrown in the executor (YARN container), was: java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.common.xcontent.json.JsonXContent. However, I did include the Elasticsearch jar in the application assembly jar built with the Maven assembly plugin. The run command is as follows:

spark-submit --executor-memory 10g --executor-cores 2 --num-executors 2 \
--queue thejob --master yarn --class com.batch.TestBat /lib/batapp-mr.jar 2016-12-20

The Maven dependencies are as follows:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.10</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-catalyst_2.10</artifactId>
    <version>1.6.0</version>
    <scope>provided</scope>
</dependency>

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.6.3</version>
    <!-- <scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-protocol</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-hadoop2-compat</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-common</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-hadoop-compat</artifactId>
    <version>1.2.0-cdh5.7.0</version>
    <!--<scope>provided</scope> -->
</dependency>


<dependency>
    <groupId>com.sksamuel.elastic4s</groupId>
    <artifactId>elastic4s-core_2.10</artifactId>
    <version>2.3.0</version>
    <!--<scope>provided</scope> -->
    <exclusions>
        <exclusion>
            <artifactId>elasticsearch</artifactId>
            <groupId>org.elasticsearch</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.3.2</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>2.3.1</version>
    <exclusions>
        <exclusion>
            <artifactId>log4j-over-slf4j</artifactId>
            <groupId>org.slf4j</groupId>
        </exclusion>
    </exclusions>
</dependency>

The weird thing is that the executor could find the HBase jars and the Elasticsearch jar, both of which are included in the dependencies, but not some of the Elasticsearch classes, so I suspect some class conflict. I checked the assembly jar and it does include the "missing" classes.
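One way to narrow this down is to check which jar the executor actually resolves the class from. Here is a minimal diagnostic sketch (assuming a SparkContext named sc, which is not in the code above); the foreach body runs inside a task, so it uses the executor's classloader and the output lands in the YARN container's stdout log:

// Diagnostic sketch (assumes an existing SparkContext named sc).
// initialize = false avoids triggering the static initializer that is failing,
// so this only reports where the class would be loaded from.
sc.parallelize(1 to 1).foreach { _ =>
  val cls = Class.forName(
    "org.elasticsearch.common.xcontent.json.JsonXContent",
    false,
    Thread.currentThread().getContextClassLoader)
  // getCodeSource can be null for bootstrap classes, hence the Option
  val location = Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation)
  println(s"JsonXContent loaded from: $location")
}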

Upvotes: 2

Views: 501

Answers (1)

Ram Ghadiyaram

Reputation: 29237

I can see you have already included the jar dependency. You have also commented out the provided scope of the dependency below, which means it will be packaged, so the same class should be available in your deployment.

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.6.3</version>
</dependency>

The only thing I suspect is the spark-submit command; please check it against something like the below.

--conf "spark.driver.extraLibrayPath=$HADOOP_HOME/*:$HBASE_HOME/*:$HADOOP_HOME/lib/*:$HBASE_HOME/lib/htrace-core-3.1.0-incubating.jar:$HDFS_PATH/*:$SOLR_HOME/*:$SOLR_HOME/lib/*" \
        --conf "spark.executor.extraLibraryPath=$HADOOP_HOME/*" \
--conf "spark.driver.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ',')
            --conf "spark.executor.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ',')

where /your-directory-of-jars is the extracted lib directory from your distribution.
You can also print the classpath from your program like below:

val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)

EDIT: After executing the lines above, if you find an old duplicate jar on your classpath, then include your libraries with your app or via --jars, and also try setting spark.{driver,executor}.userClassPathFirst to true.
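Put together, a minimal sketch of what that spark-submit could look like (the Elasticsearch jar path below is a placeholder, not taken from your setup; note that userClassPathFirst is experimental in Spark 1.6):

spark-submit --master yarn --queue thejob \
    --executor-memory 10g --executor-cores 2 --num-executors 2 \
    --conf spark.driver.userClassPathFirst=true \
    --conf spark.executor.userClassPathFirst=true \
    --jars /path/to/elasticsearch-2.3.2.jar \
    --class com.batch.TestBat /lib/batapp-mr.jar 2016-12-20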

Upvotes: 2
