Hive date format handling

I am handling JSON data containing a date as per this example 'MON 2014-01-03 13:00:00 +GMT0000'

I need to compare records on date.

Is it best to load as strings and manipulate as and when required?

A requirement will be to select the highest and lowest dates for a particular criteria, and calculate the difference in seconds.

Thanks for looking.

Upvotes: 0

Views: 3164

Answers (1)

sayan dasgupta
sayan dasgupta

Reputation: 1082

Best solution for your problem is to use unixtimestamp (seconds since standard epoch of 1/1/1970) Following is an example query, as to how you parse the timestamp-strings to unixtimestamp.

 select unix_timestamp(REGEXP_REPLACE('MON 2014-01-03 13:00:00 +GMT0000','GMT',''),
                      "EEE yyyy-MM-dd HH:mm:ss Z") as unixtime from reqtable;

You will have more details here https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions

Also you should take a look into Java SimpleDateFormat to match the exact timestamp string pattern.

Upvotes: 1

Related Questions