Apache Spark dependency issue

Question

I'm trying to run my Spark application on a Hadoop cluster. The Spark version running on the cluster is 1.3.1. I get the error posted below when I package my Spark application and run it on the cluster. I looked at other posts as well, and it seems I'm messing up the library dependencies, but I couldn't figure out how.

Here is some other information that might help:

hadoop version:

Hadoop 2.7.1.2.3.0.0-2557
Subversion git@github.com:hortonworks/hadoop.git -r 9f17d40a0f2046d217b2bff90ad6e2fc7e41f5e1
Compiled by jenkins on 2015-07-14T13:08Z
Compiled with protoc 2.5.0
From source with checksum 54f9bbb4492f92975e84e390599b881d
This command was run using /usr/hdp/2.3.0.0-2557/hadoop/lib/hadoop-common-2.7.1.2.3.0.0-2557.jar

The error stack:

java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext: method <init>(Lorg/apache/spark/api/java/JavaSparkContext;)V not found
at com.cyber.app.cyberspark_app.main.Main.main(Main.java:163)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

My pom.xml looks like this:


    
        
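<build>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>path.to.my.main.Main</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>1.6.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>1.6.1</version>
        <scope>provided</scope>
    </dependency>
</dependencies>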

I'm using "mvn package" to package my jar.

EDIT:

I tried changing all the versions to 1.3.1. If I make this change, I need to change my application, as I'm using features that only became available after 1.3.1.
But if I set everything to 1.6.1 compiled for Scala 2.10, I get the same error.

Please let me know if I need to provide any additional information. Any help will be greatly appreciated.

Thank you.

marios · Accepted Answer

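This looks like a binary compatibility issue. A NoSuchMethodError at runtime means the code was compiled against a class that has a method (here, the HiveContext constructor taking a JavaSparkContext) that the class actually loaded on the cluster does not have.

First, make sure that all your Spark dependencies are on Spark 1.3.1, the version running on the cluster. I see that you have Spark SQL and Spark Hive on 1.6.1.

Second, you are using Spark artifacts compiled for Scala 2.11. The typical distribution of Spark is compiled only for Scala 2.10. Typically, if you want the 2.11 version, you need to compile Spark yourself.

If you are not sure which Scala version the Spark on your cluster was compiled with, I would change all the dependencies to use "2.10" instead of "2.11" and try again.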
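
As a rough sketch of the two changes above, the Spark dependencies in the pom.xml might be aligned as follows. This is only a fragment, not a complete pom. The property names spark.version and scala.binary.version are illustrative names I made up, not anything Spark requires. Marking all three artifacts as provided is an assumption that the cluster supplies Spark at runtime, so the jar-with-dependencies assembly does not bundle its own, possibly mismatched, copy.

```xml
<properties>
    <!-- Illustrative property names (assumption): pin the versions in one place -->
    <!-- Must match the Spark version running on the cluster -->
    <spark.version>1.3.1</spark.version>
    <!-- Scala binary version the cluster's Spark build is assumed to use -->
    <scala.binary.version>2.10</scala.binary.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <!-- Assumption: the cluster provides Spark, so do not bundle it -->
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
</dependencies>
```

With the versions driven by two properties, every Spark artifact stays on the same Spark release and the same Scala suffix, and moving to another release later means changing one line.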
