Reputation: 3542
I am trying to run a Hive QUERy I have a table with lets say 3 columns. Of one is a date column with data as :
a d 2014-04-01
b e 2014-04-03
c f 2014-04-20
Now I want to pick the Maximum date from the above data and do a difference with the current date ( current date lets assume is 2014-04-24) and add the difference to the output. What I mean is ; the query should pick 2014-04-20 and subtract it with current date to give an output as 4 and then add this difference to all the dates to have an output as :
a d 2014-04-05
b e 2014-04-07
c f 2014-04-24
I tried this but it runs into a semantic issue:
select A, B, date_add( SOMEDATE, datediff(to_date( FROM_UNIXTIME(UNIX_TIMESTAMP() )), max(SOMEDATE))) As SOMEDATE
Upvotes: 0
Views: 1111
Reputation: 165
Doable using Hive date udf's (DATEDIFF, FROM_UNIXTIME, UNIX_TIMESTAMP, DATE_ADD): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
Assuming your source table definition is: DateSource(col1 string, col2 string, myDate string)
The query would be:
SELECT col1, col2, myDate, DATE_ADD(myDate,daysDiff) as adjustedDate
FROM DateSource
JOIN
(
SELECT DATEDIFF(FROM_UNIXTIME(UNIX_TIMESTAMP(),"yyyy-MM-dd"),maxDate) as daysDiff
FROM
(
SELECT max(myDate) as maxDate FROM DateSource
) maxDate
) diffDate;
Upvotes: 2