Reputation: 91
I want to write a Spark DataFrame into an existing parquet Hive table. I am able to do it using `df.write.mode("append").insertInto("myexistinghivetable")`,
but if I check through the file system I can see that the Spark output files land with a `.c000` extension.
What do those files mean? And how do I write a DataFrame into a parquet Hive table?
Upvotes: 3
Views: 9409
Reputation: 91
We can do it using `df.write.partitionBy("mypartitioncols").format("parquet").mode(SaveMode.Append).saveAsTable("hivetable")`.
In earlier versions of Spark, the append save mode was not available for `saveAsTable`.
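A minimal sketch of this approach, assuming a Hive-enabled SparkSession and hypothetical table, column, and path names (none of these appear in the original answer):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// enableHiveSupport() is required so that saveAsTable registers the
// table in the Hive metastore rather than Spark's default catalog.
val spark = SparkSession.builder()
  .appName("append-to-hive")
  .enableHiveSupport()
  .getOrCreate()

// Hypothetical source data.
val df = spark.read.parquet("/path/to/source")

// Append into the parquet Hive table, partitioned by the same
// columns the table was created with.
df.write
  .partitionBy("mypartitioncols")
  .format("parquet")
  .mode(SaveMode.Append)
  .saveAsTable("hivetable")
```

Note that the partition columns passed to `partitionBy` must match the table's existing partitioning, or the write will fail.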
Upvotes: 3
Reputation: 1892
You can save the DataFrame as parquet at the location your Hive table is referring to; after that you can alter the table in Hive.
You can do it like this:
`df.write.mode("append").parquet("HDFS directory path")`
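For the "alter the table" step this answer alludes to: after writing files directly to the table's directory, the Hive metastore still has to be told about any new partitions. A hedged sketch, assuming a partitioned table, a Hive-enabled SparkSession, and the hypothetical table name from the question:

```scala
// Rescan the table's directory and register any partitions the
// metastore does not yet know about (Hive's MSCK REPAIR TABLE).
spark.sql("MSCK REPAIR TABLE myexistinghivetable")
```

For an unpartitioned table, the new files are picked up automatically on the next query and no repair is needed.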
Upvotes: 3