Baeumla

Reputation: 463

How to configure Hive to use Spark?

I have a problem using Hive on Spark. I installed a single-node HDP 2.1 (Hadoop 2.4) via Ambari on CentOS 6.5. I'm trying to run Hive on Spark, so I followed these instructions:

https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

I already downloaded the "Prebuilt for Hadoop 2.4" version of Spark from the official Apache Spark website. Then I started the master with:

./spark-class org.apache.spark.deploy.master.Master

Then the worker with:

./spark-class org.apache.spark.deploy.worker.Worker spark://hadoop.hortonworks:7077

And then I started Hive with this command:

hive --auxpath /SharedFiles/spark-1.0.1-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar

Then, according to the instructions, I had to change Hive's execution engine to Spark with this command:

set hive.execution.engine=spark;

And the result is:

Query returned non-zero code: 1, cause: 'SET hive.execution.engine=spark' FAILED in validation : Invalid value.. expects one of [mr, tez].

So when I launch a simple Hive query, I can see on hadoop.hortonworks:8088 that the launched job is a MapReduce job.

Now to my question: how can I change Hive's execution engine so that it uses Spark instead of MapReduce? Is there any other way to change it? (I already tried changing it via Ambari and in hive-site.xml.)

Upvotes: 4

Views: 34222

Answers (4)

Sree Eedupuganti

Reputation: 224

Change the Hive configuration property in $HIVE_HOME/conf/hive-site.xml like this:

<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
  <description>
    Chooses execution engine.
  </description>
</property>
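After editing hive-site.xml, you can check from the Hive CLI which engine Hive actually picked up. A quick sketch, assuming a fresh session and a Hive build that supports the spark engine (Hive 1.1+):

```sql
-- Print the engine Hive read from hive-site.xml:
SET hive.execution.engine;

-- Or override it for the current session only:
SET hive.execution.engine=spark;
```

If the second statement fails with "expects one of [mr, tez]", the installed Hive version predates Spark support, and the hive-site.xml change alone will not help.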

Upvotes: 10

ROOT

Reputation: 1775

In $HIVE_HOME/conf/hive-site.xml, set the value of hive.execution.engine to spark:

<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>

Upvotes: 0

Venu A Positive

Reputation: 3062

set hive.execution.engine=spark; was introduced in Hive 1.1. I think your Hive version is older than 1.1.

Resource: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
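To see which Hive version your stack actually ships (HDP 2.1 bundles Hive 0.13, which only knows the mr and tez engines), you can ask the CLI directly:

```shell
hive --version
```

If it reports anything below 1.1, setting hive.execution.engine=spark will be rejected regardless of how the property is set.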

Upvotes: 1

Sree Eedupuganti

Reputation: 224

set hive.execution.engine=spark;

Try this command; it should run fine.

Upvotes: 5
