Akhil Sudhakaran

Reputation: 11

PySpark DataFrame to Hive table

How do I store a PySpark DataFrame object to a Hive table ("primary12345" is a Hive table)? I am using the code below; masterDataDf is a DataFrame object:

masterDataDf.write.saveAsTable("default.primary12345")

I get the error below:

: java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

Upvotes: 1

Views: 14297

Answers (1)

Sahil Desai

Reputation: 3696

You can register the DataFrame as a temporary view:

masterDataDf.createOrReplaceTempView("mytempTable") 

Then you can use a plain SQL statement to create the table and dump the data from your temp view into it:

sqlContext.sql("create table primary12345 as select * from mytempTable")

OR

If you want to use a HiveContext, you need to create one first (in PySpark):

from pyspark.sql import HiveContext

sqlContext = HiveContext(sc)

Then save the DataFrame directly (or select just the columns you want to store) as a Hive table:

masterDataDf.write.mode("overwrite").saveAsTable("default.primary12345")

Upvotes: 1
