Reputation: 157
I'm trying to run my Spark application on a Hadoop cluster. The cluster runs Spark 1.3.1. I get the error posted below when I package my application and run it on the cluster. I looked at other posts, and it seems I'm messing up the library dependencies, but I couldn't figure out what exactly is wrong.
Here is some other information that might help:
hadoop version:
Hadoop 2.7.1.2.3.0.0-2557
Subversion [email protected]:hortonworks/hadoop.git -r 9f17d40a0f2046d217b2bff90ad6e2fc7e41f5e1
Compiled by jenkins on 2015-07-14T13:08Z
Compiled with protoc 2.5.0
From source with checksum 54f9bbb4492f92975e84e390599b881d
This command was run using /usr/hdp/2.3.0.0-2557/hadoop/lib/hadoop-common-2.7.1.2.3.0.0-2557.jar
The error stack:
java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext: method <init>(Lorg/apache/spark/api/java/JavaSparkContext;)V not found
at com.cyber.app.cyberspark_app.main.Main.main(Main.java:163)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
My pom.xml looks like this:
<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>path.to.my.main.Main</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id> <!-- this is used for inheritance merges -->
          <phase>package</phase> <!-- bind to the packaging phase -->
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
  </dependency>
  <dependency> <!-- Spark dependency -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>1.3.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>1.6.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>1.6.1</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
I'm using "mvn package" to package my jar.
EDIT:
I tried changing all the versions to 1.3.1. If I do that, I have to change my application, because I'm using features that only became available after 1.3.1.
But if I set all the dependencies to 1.6.1 built for Scala 2.10, I get the same error.
Please let me know if I need to provide any additional information. Any help will be greatly appreciated.
Thank you.
Upvotes: 1
Views: 1665
Reputation: 8996
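This looks like a binary compatibility issue. A NoSuchMethodError at runtime means your code was compiled against a version of a class that differs from the one on the cluster's classpath; here, the HiveContext loaded at runtime has no constructor that takes a JavaSparkContext.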
First, make sure that all your Spark dependencies are on Spark 1.3.1, the version running on the cluster. I see that your Spark SQL and Spark Hive dependencies are on 1.6.1.
Second, you are using Spark artifacts compiled for Scala 2.11. The typical Spark 1.x distribution is compiled for Scala 2.10 only; if you want a 2.11 build, you usually need to compile Spark yourself.
If you are not sure which Scala version the Spark on your cluster was compiled with, I would change all the dependencies to use "2.10" instead of "2.11" and try again.
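As a rough sketch of the two suggestions above, the dependencies section could look something like this. It assumes the cluster's Spark 1.3.1 was built for Scala 2.10, which is not confirmed in the question. The `spark.version` and `scala.binary.version` property names are placeholders I introduced for illustration, and marking every Spark artifact as `provided` is my own assumption (it keeps Spark classes out of the jar-with-dependencies so the cluster's copies are used), so adjust as needed.

```xml
<properties>
  <!-- Assumed values: match these to what is actually installed on the cluster -->
  <spark.version>1.3.1</spark.version>
  <scala.binary.version>2.10</scala.binary.version>
</properties>

<dependencies>
  <!-- All Spark modules share one Spark version and one Scala binary version -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope> <!-- assumption: the cluster supplies Spark at runtime -->
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

Keeping the versions in properties means a later change touches one place only. As the question notes, application code that relies on features added after 1.3.1 would still have to be changed to compile against these versions.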
Upvotes: 1