Muhammad Affan

Reputation: 145

Hive Fails to Execute Spark Task: "Failed to create Spark client for Spark session"

I am trying to integrate Apache Spark with Hive in a multi-node cluster setup. My setup consists of the following machines:

Everything works fine, and I can even create a Spark session from my local environment to the production machines using "thrift://192.XXX.01.04:9863".

On my Hive machine (192.XXX.01.04), I start the required services using:

hive --service metastore
hive --service hiveserver2
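Before pointing any client at these services, it can help to confirm they are actually listening. A minimal sketch (not part of the original setup; the host is the placeholder address from the question, and 9083/10000 are the metastore and HiveServer2 Thrift ports from hive-site.xml):

```python
import socket

def is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# Placeholder host taken from the question; substitute your Hive machine.
HIVE_HOST = "192.XXX.01.04"

for port in (9083, 10000):
    try:
        status = "open" if is_listening(HIVE_HOST, port) else "closed"
    except OSError as exc:  # the placeholder host will not resolve
        status = f"unreachable ({exc})"
    print(port, status)
```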

Configurations

hive-env.sh

export HADOOP_HOME=/path/to/hadoop
export HIVE_CONF_DIR=/path/to/apache-hive-3.1.2-bin/conf

export SPARK_HOME=/path/to/Spark
export SPARK_JARS=$(echo $SPARK_HOME/jars/*.jar | tr ' ' ',')
export HIVE_AUX_JARS_PATH=$SPARK_JARS
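The Hive on Spark documentation suggests exposing only a handful of Spark jars to Hive rather than the whole jars directory, which also sidesteps many of the library conflicts mentioned below. A sketch using the same placeholder paths as above (jar names assume a Spark 2.x layout and should match your actual build):

```shell
# Link only the Spark jars Hive needs at runtime into Hive's lib directory
# (assumes HIVE_HOME is set; versions must match your Spark installation).
ln -s $SPARK_HOME/jars/scala-library*.jar          $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-core*.jar             $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-network-common*.jar   $HIVE_HOME/lib/
```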

hive-site.xml (Important properties)

<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://192.XXX.01.04:9083</value>
    </property>
    <property>
        <name>hive.execution.engine</name>
        <value>spark</value>
    </property>
    <property>
        <name>spark.master</name>
        <value>yarn</value> <!-- or local[*] for local mode -->
    </property>
    <property>
        <name>spark.submit.deployMode</name>
        <value>client</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
</configuration>
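Hive on Spark typically needs a few more Spark properties than the ones above. A hedged fragment with settings commonly recommended in the Hive on Spark guide (the values are illustrative defaults, not taken from the original post):

```xml
<property>
    <name>spark.home</name>
    <value>/path/to/Spark</value>
</property>
<property>
    <name>spark.eventLog.enabled</name>
    <value>true</value>
</property>
<property>
    <name>spark.eventLog.dir</name>
    <value>/tmp/spark-events</value>
</property>
<property>
    <name>spark.executor.memory</name>
    <value>2g</value>
</property>
<property>
    <name>spark.serializer</name>
    <value>org.apache.spark.serializer.KryoSerializer</value>
</property>
```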

Spark JARs Included

Due to conflicts with the Hive and Hadoop libraries, I kept only the essential Spark JARs, including:

 - spark-hive_2.12-3.4.2.jar
 - spark-hive-thriftserver_2.12-3.4.2.jar
 - spark-sql_2.12-3.4.2.jar
 - mysql-connector-j-8.0.31.jar
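A note on versions that may be relevant here: Hive on Spark is only tested against specific Spark releases (Hive 3.x was built against Spark 2.3.0), and pairing Hive 3.1.2 with Spark 3.4.x jars is a common trigger for the generic return code 30041. A quick check of what is actually on the PATH (output depends on your installation):

```shell
# Compare the Hive and Spark versions actually in use.
hive --version | head -n 1
spark-submit --version 2>&1 | head -n 5
```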

Issue

Whenever I run a simple query like:

SELECT COUNT(*) FROM my_table;

I get the following error:

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session dd1bae4e-bbb9-440c-a29f-68968e6b0421)'
FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
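One way to narrow this down (a suggestion, not from the original post): queries that Hive can answer with a simple fetch task never start a Spark session, so comparing the two isolates the execution engine from the metastore:

```sql
-- Served by Hive's fetch task with default hive.fetch.task.conversion:
-- no Spark session is created, so this should succeed even when the
-- aggregation fails.
SELECT * FROM my_table LIMIT 5;

-- Forces a distributed job and therefore a Spark session.
SELECT COUNT(*) FROM my_table;
```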

What I've Tried

  1. Verified that Hive metastore and HiveServer2 are running properly.
  2. Checked that Spark is accessible from the Hive machine.
  3. Ensured that my hive-site.xml contains the correct spark.master and hive.execution.engine.
  4. Made sure my Spark JARs are properly configured.
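Since return code 30041 is a generic wrapper, the underlying cause usually appears in the Hive client log or in the failed YARN application. A couple of hedged diagnostics (paths are the usual defaults; adjust them to your installation):

```shell
# Hive's default client log location (hive-log4j2.properties may move it):
tail -n 200 /tmp/$USER/hive.log

# With spark.master=yarn, a failed session leaves a YARN application behind;
# list recent failures and pull the logs for the relevant one.
yarn application -list -appStates FAILED,KILLED | head
yarn logs -applicationId <application_id>   # substitute the real id
```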

Questions

What is causing the "Failed to create Spark client for Spark session" error, and how can I fix it?

Any help would be greatly appreciated!

Upvotes: 0

Views: 19

Answers (0)
