Reputation: 891
Difference between Spark-SQL and Hive on Spark. I am going through the Spark SQL documentation and trying to understand the difference between Spark SQL and Hive on Spark.
hive-site.xml
and then persist a table in my Spark program, where will the data and metadata be stored? Will Spark create a new Hive metastore (like Derby)?
hive-site.xml
and making Spark aware of the existing Hive. Then, if I persist the table, will the metadata be stored in my existing Hive metastore and the data in the warehouse directory on HDFS?
Thanks.
Upvotes: 1
Views: 1004
Reputation: 49
When you start a Spark session, the data can be stored in S3 or HDFS. Spark will not create a Hive session unless you explicitly ask for one.
Yes, if you use 'saveAsTable' referencing a Hive table, the data will be persisted in HDFS. Bear in mind that if you drop the HDFS instance, such as in EMR, the table will be dropped along with its data.
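To make the two points above concrete, here is a minimal PySpark sketch, assuming Spark 2.x or later. The function names `build_session` and `persist_demo`, the app name, the table name, and the sample rows are all made up for illustration. As far as I understand the Spark SQL docs, when Hive support is enabled and no hive-site.xml is on the classpath, Spark creates a local Derby `metastore_db` in the current directory and writes table data under `spark.sql.warehouse.dir`; with a hive-site.xml present, it should use the metastore configured there instead.

```python
def build_session(warehouse_dir, use_hive=True):
    """Build a SparkSession; Hive support is only on if requested explicitly."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    builder = (
        SparkSession.builder
        .appName("persist-table-sketch")  # hypothetical app name
        # Where managed table data is written (an HDFS or S3 path also works here).
        .config("spark.sql.warehouse.dir", warehouse_dir)
    )
    if use_hive:
        # Without this call Spark uses its in-memory catalog, so table
        # metadata does not survive the session.
        builder = builder.enableHiveSupport()
    return builder.getOrCreate()


def persist_demo(spark, table_name="demo_table"):
    """Write a tiny DataFrame as a managed table and return the table name."""
    # Sample rows are placeholders, not real data.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    # saveAsTable registers the table in the catalog and writes the data
    # files under the warehouse directory.
    df.write.mode("overwrite").saveAsTable(table_name)
    return table_name
```

Calling `persist_demo(build_session("/tmp/spark-warehouse"))` and then looking at `spark.catalog.listTables()` and the warehouse path is a quick way to see where the metadata and the data actually ended up in your setup.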
Not sure about question #3.
Upvotes: 0