Comencau

Reputation: 1193

Impala - Replace all data in a table's partition

I have a program that generates all the data for one partition of an Impala table. The program writes the data to an HDFS text file.

How can I (physically) remove all the data that previously belonged to the partition and replace it with the data from the new text file, converted to Parquet format?

If I physically remove the old Parquet files that make up the partition using the raw HDFS API, will that disturb Impala?

Upvotes: 1

Views: 7129

Answers (1)

fi11er

Reputation: 679

Create an external table over your text files:

create external table stg_table (...) location '<hdfs directory containing your text file>';

Whenever the external data changes, you have to refresh the table:

refresh stg_table;

Then insert into your target table:

insert overwrite table target_table select * from stg_table;

If your target table is partitioned, do this:

insert overwrite table target_table partition(<partition spec>) select * from stg_table;

The keyword 'overwrite' does the trick: it replaces the existing contents of the table or partition with the result of the select.
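Putting the steps together, a minimal end-to-end sketch might look like the following. The column names, the partition column day, the comma delimiter, and the HDFS path are all assumptions made for illustration; substitute your own schema. The target table is declared as Parquet so that the insert performs the text-to-Parquet conversion.

```sql
-- Hypothetical staging table over the directory the program writes to.
-- Columns, delimiter and path are placeholders, not taken from the question.
CREATE EXTERNAL TABLE stg_table (
  id      BIGINT,
  payload STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/etl/staging/my_partition';

-- Hypothetical partitioned target table stored as Parquet.
CREATE TABLE target_table (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (day STRING)
STORED AS PARQUET;

-- Pick up files that were written to the staging directory outside Impala.
REFRESH stg_table;

-- Replace the contents of a single partition; other partitions are untouched.
-- With a static partition spec, the select list omits the partition column.
INSERT OVERWRITE TABLE target_table PARTITION (day = '2015-01-01')
SELECT id, payload FROM stg_table;
```

As far as I know, the overwrite deletes the partition's old data files itself, so there should be no need to remove them through the HDFS API. If you do delete or add files under the table's directory outside Impala, run refresh target_table; afterwards so that Impala reloads the file metadata for that table.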

Upvotes: 4
