rahul
rahul

Reputation: 1473

Hive : Insert overwrite multiple partitions

I have a Hive table partitioned on date. I want to be able to selectively overwrite the partitions for the last 'n' days (or custom list of partitions).

Is there a way to do it without writing "INSERT OVERWRITE DIRECTORY" statement for each partition?

Any help is greatly appreciated.

Upvotes: 11

Views: 25041

Answers (2)

Balaram Raju
Balaram Raju

Reputation: 41

Try this . This will exclude dt from * and then adding dt will set the order as required:

SET hive.support.quoted.identifiers=none;
INSERT OVERWRITE TABLE dst partition (dt)
SELECT `(dt)?+.+`, dt from tableName;

Upvotes: 1

libjack
libjack

Reputation: 6443

Hive supports dynamic partitioning, so you can build a query where the partition is just one of the source fields.

INSERT OVERWRITE TABLE dst partition (dt) 
SELECT col0, col1, ... coln, dt from src where ...

The where clause can specify which values of dt you want to overwrite.

Just include the partition field (dt in this case) last in the list from the source, you can even do SELECT *, dt if the dt field is already part of the source or even SELECT *,my_udf(dt) as dt, etc

By default, Hive wants at least one of the partitions specified to be static, but you can allow it to be nonstrict; so for the above query, you can set the following before the running:

set hive.exec.dynamic.partition.mode=nonstrict;

Upvotes: 25

Related Questions