Kring

Reputation: 133

Spark 2.1 - Error While instantiating HiveSessionState

With a fresh install of Spark 2.1, I am getting an error when executing the pyspark command.

Traceback (most recent call last):
File "/usr/local/spark/python/pyspark/shell.py", line 43, in <module>
spark = SparkSession.builder\
File "/usr/local/spark/python/pyspark/sql/session.py", line 179, in getOrCreate
session._jsparkSession.sessionState().conf().setConfString(key, value)
File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/local/spark/python/pyspark/sql/utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"

I have Hadoop and Hive on the same machine. Hive is configured to use MySQL for the metastore. I did not get this error with Spark 2.0.2.

Can someone please point me in the right direction?

Upvotes: 9

Views: 35239

Answers (10)

Sandeep Sompalle

Reputation: 1

The project location and file permissions can be the issue. I observed this error even after changing my pom file. I then moved my project into a directory under my user home, where I have full permissions, and that solved my issue.
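As a rough illustration (the paths below are hypothetical; substitute your own project location), moving the project under the home directory and rebuilding looked like this:

    # Move the project to a directory the current user fully owns, then rebuild
    mv /opt/shared/my-spark-project ~/my-spark-project
    cd ~/my-spark-project
    mvn clean package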

Upvotes: 0

Saurabh Sinha

Reputation: 1373

I removed ".enableHiveSupport()\" from the shell.py file and it works perfectly.

Before:

    spark = SparkSession.builder\
        .enableHiveSupport()\
        .getOrCreate()

After:

    spark = SparkSession.builder\
        .getOrCreate()

Upvotes: 0

llevar

Reputation: 785

I was getting this error when trying to run pyspark and spark-shell while my HDFS was not started.
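If your setup is similar, a minimal sketch (assuming a standard Hadoop install with the sbin scripts under $HADOOP_HOME) is to check for the HDFS daemons and start them before launching the shells:

    # List running Java processes; NameNode/DataNode should appear if HDFS is up
    jps
    # If they are missing, start HDFS, then launch the Spark shell
    $HADOOP_HOME/sbin/start-dfs.sh
    pyspark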

Upvotes: 0

Vivek Jain

Reputation: 23

I was struggling with this in cluster mode too. Adding hive-site.xml to the Spark conf directory fixed it; if you have an HDP cluster, that directory should be /usr/hdp/current/spark2-client/conf. It is working for me.
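A sketch of that step, assuming the Hive client configuration lives under /etc/hive/conf (adjust the source path to your installation):

    # Make Hive's metastore settings visible to Spark (HDP paths assumed)
    cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark2-client/conf/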

Upvotes: 0

Deepak

Reputation: 4938

I saw this error on a new (2018) Mac, which came with Java 10. The fix was to set JAVA_HOME to Java 8:

export JAVA_HOME=`/usr/libexec/java_home -v 1.8`
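To make the setting survive new terminal sessions (a sketch assuming the default bash shell; adapt for zsh), append it to your profile:

    # Persist JAVA_HOME across sessions (bash assumed)
    echo 'export JAVA_HOME=`/usr/libexec/java_home -v 1.8`' >> ~/.bash_profile
    source ~/.bash_profile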

Upvotes: 0

Ramesh Maharjan

Reputation: 41987

For me, the issue was solved by unsetting the HADOOP_CONF_DIR environment variable. It was pointing to my Hadoop configuration directory, so when the pyspark shell started, Spark tried to connect to a Hadoop cluster that had not been started.

So if you have HADOOP_CONF_DIR set, either start the Hadoop cluster before using the Spark shells, or unset the variable.
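A minimal sketch of both options, assuming a standard Hadoop layout with the sbin scripts under $HADOOP_HOME:

    # Option 1: start the Hadoop cluster first, then launch the Spark shell
    $HADOOP_HOME/sbin/start-dfs.sh
    pyspark

    # Option 2: unset the variable for this session so Spark runs without the cluster
    unset HADOOP_CONF_DIR
    pyspark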

Upvotes: 3

Nim J

Reputation: 1033

I was getting the same error in a Windows environment, and the trick below worked for me.

In shell.py, the Spark session is defined with .enableHiveSupport():

    spark = SparkSession.builder\
            .enableHiveSupport()\
            .getOrCreate()

Remove the Hive support and redefine the Spark session as follows:

    spark = SparkSession.builder\
            .getOrCreate()

You can find shell.py in your Spark installation folder; for me it is in "C:\spark-2.1.1-bin-hadoop2.7\python\pyspark".

Hope this helps.

Upvotes: 17

user3542930

Reputation: 557

You are missing the spark-hive jar.

For example, if you are running Spark 2.1 on Scala 2.11, you can use this jar:

https://mvnrepository.com/artifact/org.apache.spark/spark-hive_2.11/2.1.0
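If your Spark build is genuinely missing the Hive classes, one way to pull that artifact in at launch time is the --packages flag (a sketch; the prebuilt Spark 2.1 distributions normally already ship spark-hive in their jars/ directory, so check there first):

    # Fetch spark-hive from Maven Central when starting the shell (Scala 2.11 / Spark 2.1.0 assumed)
    pyspark --packages org.apache.spark:spark-hive_2.11:2.1.0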

Upvotes: 0

marilena.oita

Reputation: 994

I had the same problem. Suggestions from other answers, such as sudo chmod -R 777 /tmp/hive/ or downgrading to a Spark build for Hadoop 2.6, did not work for me. I realized that what caused this problem for me was that I was running SQL queries through the sqlContext instead of through the SparkSession.

    sparkSession = SparkSession.builder.master("local[*]").appName("appName").config("spark.sql.warehouse.dir", "./spark-warehouse").getOrCreate()
    sqlCtx.registerDataFrameAsTable(..)
    df = sparkSession.sql("SELECT ...")

This works perfectly for me now.
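For reference, a self-contained sketch of the SparkSession-based flow (the DataFrame, table name, and query below are made up for illustration):

    from pyspark.sql import SparkSession

    # Build the session once and use it for both registration and SQL
    sparkSession = SparkSession.builder \
        .master("local[*]") \
        .appName("appName") \
        .config("spark.sql.warehouse.dir", "./spark-warehouse") \
        .getOrCreate()

    # Hypothetical data; replace with your own DataFrame
    df = sparkSession.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    df.createOrReplaceTempView("my_table")

    result = sparkSession.sql("SELECT id, value FROM my_table WHERE id = 1")
    result.show()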

Upvotes: 12

Vjender M

Reputation: 111

Spark 2.1.0 - when I run it in yarn-client mode I don't see this issue, but yarn-cluster mode gives "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':".

Still looking for answer.

Upvotes: 4
