Grigory Skvortsov

Reputation: 463

PySpark does not display the Hive databases

I am trying to connect to a Hive database from PySpark, but I can't see my databases (only default):

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.5
      /_/

Using Python version 3.7.4 (default, Aug 13 2019 20:35:49)
SparkSession available as 'spark'.
>>> spark.sql('show databases')
DataFrame[databaseName: string]
>>> spark.sql('show databases').show()
+------------+
|databaseName|
+------------+
|     default|
+------------+

But if I run the same command in Hive, I get the following:

hive> show databases;
OK
signals
default
test
Time taken: 0.973 seconds, Fetched: 3 row(s)
hive> 

What should I do to connect to my Hive instance?

Upvotes: 0

Views: 1647

Answers (1)

SnigJi

Reputation: 1410

Please check whether you have configured Spark to use the Hive metastore.

Open SPARK_HOME/conf/hive-site.xml and check for the following property. If it is not there, add it.

<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- hostname must point to the Hive metastore URI in your cluster -->
    <value>thrift://hostname:port</value>
    <description>URI for client to contact metastore server</description>
  </property>
</configuration>
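
If you run a standalone PySpark script rather than the shell, the same property can usually be passed when the session is built. The following is a minimal sketch, not the only way to do it: `hive_session` and the app name are hypothetical names, the `thrift://` check is just a sanity guard added for illustration, and some setups may need the key written as `spark.hadoop.hive.metastore.uris` instead.

```python
def hive_session(metastore_uri: str, app_name: str = "hive-metastore-check"):
    """Build a SparkSession pointed at a remote Hive metastore.

    `metastore_uri` is the same value you would put in hive.metastore.uris,
    e.g. "thrift://hostname:port" (placeholder, use your cluster's address).
    """
    # Sanity guard (illustrative): metastore URIs use the thrift:// scheme.
    if not metastore_uri.startswith("thrift://"):
        raise ValueError(f"expected a thrift:// URI, got {metastore_uri!r}")

    # Imported here so the check above runs even where pyspark is missing.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        # Must be set before the session exists; it has no effect afterwards.
        .config("hive.metastore.uris", metastore_uri)
        .enableHiveSupport()
        .getOrCreate()
    )
```

Calling `hive_session("thrift://hostname:port").sql("show databases").show()` should then list the same databases the hive CLI shows, provided the metastore is reachable. Inside the pyspark shell `getOrCreate()` returns the session that already exists, so there the hive-site.xml route above is the one to use.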

Note: If you don't know the hostname and port of your metastore, look in HIVE_HOME/conf/hive-site.xml. You can find that property there.
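
To pull that value out of Hive's config without reading the file by hand, something like the sketch below should work. It uses only the Python standard library; `read_hive_property` is a hypothetical helper name, and the path in the usage line is an assumption about where your HIVE_HOME lives.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_hive_property(path: str, name: str) -> Optional[str]:
    """Return the <value> of the named <property> in a hive-site.xml file,
    or None if the property is not present."""
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        # Each <property> holds a <name> and a <value> child element.
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None
```

For example, `read_hive_property("/path/to/hive/conf/hive-site.xml", "hive.metastore.uris")` returns the thrift:// URI if it is set there. If it returns None, Hive is probably using an embedded or local metastore and there is no remote URI to copy.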

Upvotes: 1
