RachmaninovQuartet

Reputation: 31

How do I get HiveContext running properly under spark-submit with Tez and YARN?

I've been trying to get spark-submit to work with org.apache.spark.sql.hive.HiveContext, but I keep getting java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning. Here is the code, which fails on the last line:

import org.apache.spark.SparkContext

val sc = SparkContext.getOrCreate()
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

This is on Hortonworks 2.3.4, with Spark 1.5.2, Hive 1.2.1, Hadoop 2.7.1, and Tez 0.7.0. I'm using Maven for all dependencies except DataNucleus, and I pass hive-site.xml and tez-site.xml in the --files argument of spark-submit. Here is the Tez-related excerpt from my pom:

    <dependency>
        <groupId>org.apache.tez</groupId>
        <artifactId>tez-api</artifactId>
        <version>${tez.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.tez</groupId>
        <artifactId>tez-dag</artifactId>
        <version>${tez.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.tez</groupId>
        <artifactId>tez-common</artifactId>
        <version>${tez.version}</version>
    </dependency>

The same code works properly in spark-shell. Any advice?

Upvotes: 2

Views: 1571

Answers (1)

RachmaninovQuartet

Reputation: 31

Following @user1314742's advice, I removed everything Tez-related from a copy of hive-site.xml and passed that copy in the --files argument to spark-submit, so my actual Hive configs stay unchanged.

So put a new hive-site.xml into your Spark conf directory, remove the Tez settings from it, and try again. That should resolve the problem.
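If you'd rather not trim the file by hand, here is a rough sketch of producing the Tez-free copy of hive-site.xml. It uses only the JDK's XML classes. The object name StripTez and the rule for what counts as "Tez related" (any property whose name or value mentions "tez", which includes hive.execution.engine when it is set to tez) are my own assumptions, not something from the answer above, so check the output before relying on it.

```scala
import java.io.{ByteArrayInputStream, StringWriter}
import java.nio.file.{Files, Paths}
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.TransformerFactory
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult
import org.w3c.dom.Element

// Hypothetical helper: writes a copy of hive-site.xml without Tez properties.
object StripTez {
  // Assumption: "Tez related" means the property name or value mentions "tez".
  def isTezRelated(name: String, value: String): Boolean =
    name.toLowerCase.contains("tez") || value.toLowerCase.contains("tez")

  // Text of the first child element with the given tag, or "" if absent.
  def childText(prop: Element, tag: String): String = {
    val nodes = prop.getElementsByTagName(tag)
    if (nodes.getLength > 0) nodes.item(0).getTextContent.trim else ""
  }

  def stripTez(xml: String): String = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    val props = doc.getElementsByTagName("property")
    // Snapshot the live NodeList before removing anything from the tree.
    val elems = (0 until props.getLength).map(i => props.item(i).asInstanceOf[Element])
    elems
      .filter(p => isTezRelated(childText(p, "name"), childText(p, "value")))
      .foreach(p => p.getParentNode.removeChild(p))
    // Serialize the modified document back to a string.
    val out = new StringWriter()
    TransformerFactory.newInstance().newTransformer()
      .transform(new DOMSource(doc), new StreamResult(out))
    out.toString
  }

  // Usage: StripTez <path to original hive-site.xml> <path for the trimmed copy>
  def main(args: Array[String]): Unit = {
    val Array(in, out) = args
    val xml = new String(Files.readAllBytes(Paths.get(in)), "UTF-8")
    Files.write(Paths.get(out), stripTez(xml).getBytes("UTF-8"))
  }
}
```

The trimmed copy is then the file to pass via --files (or drop into the Spark conf directory), leaving the cluster's real hive-site.xml untouched.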

Upvotes: 1
