Reputation: 745
I am using Spark 2.1.0 and trying to establish a connection with Hive tables. My Hive data warehouse is at /user/hive/warehouse in HDFS; by listing the contents of that folder I can see all the dbname.db folders in it.
After some research I found that I need to specify spark.sql.warehouse.dir in Spark 2.x, so I set it like this:
val spark = SparkSession
.builder()
.appName("Spark Hive Example")
.config("spark.sql.warehouse.dir", "/user/hive/warehouse")
.enableHiveSupport()
.getOrCreate()
Now I am trying to print the databases:
spark.sql("show databases").show()
but I only see the default database:
+------------+
|databaseName|
+------------+
| default|
+------------+
So is there any way I can connect Spark to the existing Hive databases? Is there anything I am missing here?
Upvotes: 4
Views: 4500
Reputation: 133
Step one: you should configure the properties under Custom spark2-defaults:
Step two: run the following commands from the spark-shell:
import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
hive.showDatabases().show()
Integrating Apache Hive with Spark and BI: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/integrating-hive/content/hive_configure_a_spark_hive_connection.html
HiveWarehouseSession API operations: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/integrating-hive/content/hive_hivewarehousesession_api_operations.html
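Once the HiveWarehouseSession is built, queries can also go through it directly. A brief sketch based on the HWC API operations docs linked above; the database and table names here are placeholders, not part of the original answer:

```scala
// Point the session at a database and run a query through it.
// "mydb" and "mytable" are hypothetical names; substitute your own.
hive.setDatabase("mydb")
val df = hive.executeQuery("select * from mytable limit 10")
df.show()
```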
Upvotes: 0
Reputation: 1
There is a hive-site.xml file in /usr/lib/hive/conf. Copy this file to /usr/lib/spark/conf and then you will see the other databases. Please follow the steps below.
1. Open the Hive console and create a new database: hive> create database venkat;
2. Close the Hive terminal.
3. Copy the hive-site.xml file: sudo cp /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/hive-site.xml
4. Check the databases: sqlContext.sql("show databases").show(); (in Spark 2.x you can equivalently use spark.sql("show databases").show())
I hope this helps.
Upvotes: 0
Reputation: 15317
Your hive-site.xml should be on the classpath. Check this post. If you are using a Maven project, you can keep this file in the resources folder.
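For reference, a minimal hive-site.xml usually only needs the metastore URI. This is a sketch, not the asker's actual file; the host and port are placeholders and must match your cluster:

```xml
<configuration>
  <property>
    <!-- Address of the Hive metastore Thrift service; adjust host/port for your cluster -->
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
```

In a Maven project, placing this file in src/main/resources puts it on the runtime classpath.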
Another way to connect to Hive is by using the metastore URI:
val spark = SparkSession
.builder()
.appName("Spark Hive Example")
.master("local[*]")
.config("hive.metastore.uris", "thrift://localhost:9083")
.enableHiveSupport()
.getOrCreate();
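With the metastore connection in place, the existing databases should be visible through the normal SQL API. A quick check (the table name below is hypothetical; the output depends on what is actually in your metastore):

```scala
// List all databases known to the Hive metastore
spark.sql("show databases").show()

// Query an existing table; "mydb.mytable" is a placeholder name
spark.sql("select * from mydb.mytable limit 10").show()
```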
Upvotes: 5