Elad Eldor

Reputation: 831

Hive on Tez doesn't work in Spark 2

When working with HDP 2.5 and Spark 1.6.2, we used Hive with Tez as its execution engine, and it worked.

But when we moved to HDP 2.6 with Spark 2.1.0, Hive no longer worked with Tez as its execution engine, and the following exception was thrown when the DataFrame.saveAsTable API was called:

java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:529)
    at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)

After looking at the answer to this question, we switched the Hive execution engine from Tez to MR (MapReduce), and it worked.

However, we'd like to work with Hive on Tez. What is required to resolve the above exception so that Hive on Tez works?

Upvotes: 3

Views: 3918

Answers (1)

Vijayanand

Reputation: 500

I had the same issue when the Spark job was running in YARN cluster mode. It was resolved by adding the correct hive-site.xml to "spark.yarn.dist.files" in the Spark defaults configuration (spark-defaults.conf).

There are two different hive-site.xml files. One is the full Hive configuration, at /usr/hdp/current/hive-client/conf/hive-site.xml. The other is a lighter version for Spark, containing only the settings Spark needs to work with Hive, at /etc/spark//0/hive-site.xml (please check the exact path on your setup).

Use the second file for spark.yarn.dist.files.
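As a rough sketch of what that setting might look like in spark-defaults.conf (the path is copied as written above and is only a placeholder, so substitute the real location of the Spark-specific hive-site.xml on your cluster):

```
# spark-defaults.conf (sketch; the path below is an assumption, verify it on your setup)
# Ship the Spark-specific hive-site.xml to the YARN containers of the job.
spark.yarn.dist.files   /etc/spark//0/hive-site.xml
```

The same property can also be passed per job with spark-submit's --conf option (for example, --conf spark.yarn.dist.files=<path to the Spark hive-site.xml>) instead of changing the cluster-wide defaults.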

Upvotes: 1
