Reputation: 538
how to save a spark dataframe into one partition of a partitioned hive table?
raw_nginx_log_df.write.saveAsTable("raw_nginx_log")
The line above overwrites the whole table rather than a specific partition. Although I can solve the problem with the following code, it is obviously not elegant.
raw_nginx_log_df.registerTempTable("tmp_table")
sql(s"INSERT OVERWRITE TABLE raw_nginx_log PARTITION (par = '$PARTITION_VAR') SELECT * FROM tmp_table")
It seems that no similar question has been asked on stackoverflow.com before!
Upvotes: 1
Views: 6938
Reputation: 73
YourDataFrame.write.format("parquet").option("path", "/pathHiveLocation").mode(SaveMode.Append).partitionBy("partitionCol").saveAsTable("YourTable")
This works for Parquet files/tables. You may customize it as per your requirement.
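If you need to overwrite just one partition rather than append, a sketch using dynamic partition overwrite may help. This assumes Spark 2.3+, a `SparkSession` named `spark`, and the table and column names from the question (`raw_nginx_log`, partitioned by `par`):

```scala
// Sketch: overwrite only the partitions present in the DataFrame,
// leaving all other partitions of the Hive table untouched.
// Assumes Spark 2.3+ and that raw_nginx_log_df's columns match the
// table's column order, with the partition column `par` last.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

raw_nginx_log_df
  .write
  .mode("overwrite")           // with dynamic mode, this is per-partition, not whole-table
  .insertInto("raw_nginx_log") // position-based column mapping, unlike saveAsTable
```

Note that `insertInto` maps columns by position, not by name, so the DataFrame's column order must match the table definition exactly.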
Upvotes: 2