Reputation: 31
I've been trying to get spark-submit to work with org.apache.spark.sql.hive.HiveContext, but I keep getting java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning. Here is the code; it fails on the last line:
val sc = SparkContext.getOrCreate()
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
This is on Hortonworks 2.3.4, with Spark 1.5.2, Hive 1.2.1, Hadoop 2.7.1, and Tez 0.7.0. I'm using Maven for all dependencies except DataNucleus, and I pass hive-site.xml and tez-site.xml in the --files argument of spark-submit. Here is the Tez-related excerpt from my pom:
<dependency>
    <groupId>org.apache.tez</groupId>
    <artifactId>tez-api</artifactId>
    <version>${tez.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.tez</groupId>
    <artifactId>tez-dag</artifactId>
    <version>${tez.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.tez</groupId>
    <artifactId>tez-common</artifactId>
    <version>${tez.version}</version>
</dependency>
The same code works properly in spark-shell. Any advice?
Upvotes: 2
Views: 1571
Reputation: 31
Following @user1314742's advice, I removed everything Tez-related from hive-site.xml. I passed the edited file in the --files argument to spark-submit, so as not to change my actual Hive configs.
So put the new hive-site.xml into your Spark conf directory, remove the Tez settings from it, and try again. That should resolve the problem.
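For illustration, here is a minimal sketch of what the Spark-only hive-site.xml might contain after the Tez settings are stripped. The post does not say which properties were removed, so this assumes the Tez dependency came in through hive.execution.engine being set to tez; switching it to mr (or dropping the property entirely) is one way to stop the Hive client classes from asking for Tez.

```xml
<!-- Copy of hive-site.xml used only by Spark; the cluster's own Hive config stays untouched. -->
<configuration>
  <!-- Assumption: the original file had <value>tez</value> here. -->
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
  <!-- Keep the remaining non-Tez properties (metastore settings, etc.) exactly as they were. -->
</configuration>
```

With a file like this, tez-site.xml no longer needs to be passed in --files, and the Tez dependencies in the pom may turn out to be unnecessary as well.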
Upvotes: 1