craigtb
craigtb

Reputation: 667

Hive - Partition Column Equal to Current Date

I am trying to insert into a Hive table from another table that does not have a column for todays date. The partition I am trying to create is at the date level. What I am trying to do is something like this:

INSERT OVERWRITE TABLE table_2_partition 
PARTITION (p_date = from_unixtime(unix_timestamp() - (86400*2) ,  'yyyy-MM-dd'))
SELECT * FROM table_1;

But when I run this I get the following error:

"cannot recognize input near 'from_unixtime' '(' 'unix_timestamp' in constant"

If I query a table and make one of the columns that it work just fine. Any idea how to set the partition date to current system date in HiveQL?

Thanks in advance,

Craig

Upvotes: 5

Views: 4750

Answers (3)

I hope you are running a shell script and then you can store the current date in a variable. Then you create an empty table in Hive using beeline with just partition column. Once done then while inserting the data into partition table you can add that variable as your partition column and data will be inserted.

Upvotes: 0

avikm
avikm

Reputation: 802

Instead of using unix_timestamp() and from_unixtime() functions, current_date() can used to get current date in 'yyyy-MM-dd' format.

current_date() is added in hive 1.2.0. official documentation

revised query will be :

SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE table_2_partition 
PARTITION (p_date)
SELECT
    *
  , current_date()
FROM table_1;

Upvotes: 0

Jeremy Beard
Jeremy Beard

Reputation: 2725

What you want here is Hive dynamic partitioning. This allows the decision for which partition each record is inserted into be determined dynamically as the record is selected. In your case, that decision is based on the date when you run the query.

To use dynamic partitioning your partition clause has the partition field(s) but not the value. The value(s) that maps to the partition field(s) is the value(s) at the end of the SELECT, and in the same order.

When you use dynamic partitions for all partition fields you need to ensure that you are using nonstrict for your dynamic partition mode (hive.exec.dynamic.partition.mode).

In your case, your query would look something like:

SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE table_2_partition 
PARTITION (p_date)
SELECT
    *
  , from_unixtime(unix_timestamp() - (86400*2) ,  'yyyy-MM-dd')
FROM table_1;

Upvotes: 4

Related Questions