user2239884
user2239884

Reputation: 21

drop partitions dynamically in hive

I have two tables in hive, both partitioned by a String that represents a timestamp (I use a string instead of a timestamp since I'm using the tables in Cloudera Impala, that doesn't support tables partitioned by a timestamp).

The tables are used to store a lot of data in specific time slices. The first table contains the newest data in higher time granularity, lets say 1 Minute time slices and the second one older data in lower granularity, lets say 1 hour time slices here.

So I have a query that sums up the data that is older than a specific amount of time of the 1 minute time slices so that I have the data for 1 hour time slices and inserts it into my table with the 1 hour time slices.

After I have created the 1 hour time slices, I want to delete all 1 minute time slices that are contained in the new 1 hour time slices. And since the table is partitioned by my string representing the time, I can just drop the according partitions.

And know my actual question:

is it possible in hive to drop partitions somehow like

 ALTER TABLE oneMinSlices DROP IF EXISTS PARTITION(time < 'YYYY-MM-DD HH:MM:SS')

thanks in advance for your help.

ps: if you're wondering why I am doing this: the data is continuous growing and if we don't delete the 1 minute time slices, the table containing them is getting extremly big, causing our querys to be slower and needing a lot of space.

Upvotes: 2

Views: 16037

Answers (2)

Jordan Young
Jordan Young

Reputation: 356

As of Hive 0.9.0, you can use all comparators in ALTER TABLE .. DROP PARTITION statements. This means that your original statement should work. See: https://issues.apache.org/jira/browse/HIVE-2908.

Upvotes: 6

Balaswamy Vaddeman
Balaswamy Vaddeman

Reputation: 8530

You have to write a shell script for that .

in .hql file write below query

ALTER TABLE oneMinSlices DROP IF EXISTS PARTITION(time = '${hiveconf:timestamp}')

Now pass partition as command line argument dynamically.

hive -hiveconf timestamp=2013-04-22\ 05:12:20 -f hqlfilepath.hql

Last step is take to take the time stamp dynamically from shell.

Run above command till your condition is satisfied.

Upvotes: 1

Related Questions