Reputation: 1193
I have a program that generates all the data for an Impala table partition. The program writes the data to an HDFS text file.
How can I (physically) remove all the data that previously belonged to the partition and replace it with the data from the new text file, converted to Parquet format?
If I physically remove the old Parquet files that make up the partition using the raw HDFS API, will that disturb Impala?
Upvotes: 1
Views: 7129
Reputation: 679
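Create an external table over your text files (the location must be an HDFS directory, not a single file):
create external table stg_table (...) location '<HDFS directory containing your text file>';
After the external data changes, you have to refresh the table: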
refresh stg_table;
Then insert into your target table:
insert overwrite table target_table select * from stg_table;
If your target table is partitioned, do this:
insert overwrite table target_table partition(<partition spec>) select * from stg_table;
The keyword 'overwrite' does the trick: it replaces the existing contents of the table or partition.
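Put together, the steps above might look like the following minimal sketch. The column names (id, payload), the partition column (day), the tab delimiter, and the HDFS path are all hypothetical placeholders, not taken from the question. It also assumes the target table is declared as Parquet, since that declaration is what makes the inserted data land in Parquet format.

```sql
-- All names, columns, and paths below are hypothetical placeholders.

-- Staging table over the HDFS directory that holds the generated text file.
CREATE EXTERNAL TABLE IF NOT EXISTS stg_table (
  id BIGINT,
  payload STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/etl/staging/my_partition';

-- Parquet target table, partitioned by an assumed "day" column.
CREATE TABLE IF NOT EXISTS target_table (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (day STRING)
STORED AS PARQUET;

-- Make Impala pick up the newly written text file.
REFRESH stg_table;

-- Replace the contents of a single partition. With a static partition spec,
-- the partition column is not part of the SELECT list.
INSERT OVERWRITE TABLE target_table PARTITION (day = '2015-01-01')
SELECT id, payload FROM stg_table;
```

With this approach there should be no need to delete the old Parquet files by hand. If files are nevertheless added or removed directly in HDFS, running refresh target_table; afterwards lets Impala reload its cached file metadata for that table.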
Upvotes: 4