Reputation: 25
I am querying a table in Hive with json payloads and am extracting the timestamp from these payloads. the problem is that timestamps are present in different timezone formats and I'm trying to extract them all in my timezone.
I am currently using the following:
select
from_unixtime(unix_timestamp(get_json_object (table.payload, '$.timestamp'), "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"))
FROM table
This is returning the correct values if the timestamp is in this format: 2018-08-16T08:54:05.543Z --> 2018-08-16 18:54:05 (changed format and converted into my timezone)
However the query above returns 'null' if the payload contains the timestamp in this format: 2018-09-13T01:35:08.460+0000 2018-09-13T11:35:09+10:00
How can I adjust my query to work for all types of timestamps all converting to proper timezone (+10 is my timezone!) and all in the same format?
Thanks in advance!
Upvotes: 0
Views: 4496
Reputation: 44
Without regexp use Z for +1000 of XXX for +10:00 :
select unix_timestamp('2016-07-30T10:29:33.000+03:00', "yyyy-MM-dd'T'HH:mm:ss.SSSXXX") as t1
select unix_timestamp('2016-07-30T10:29:33.000+0300', "yyyy-MM-dd'T'HH:mm:ss.SSSZ") as t2
Full docs about time formats: https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
Upvotes: -1
Reputation: 1772
How about the following macro:
create temporary macro extract_ts(ts string)
from_unixtime(unix_timestamp(regexp_extract(ts, '(.*)\\+(.*)', 1), "yyyy-MM-dd'T'HH:mm:ss") + 3600*cast(regexp_extract(ts, '(.*)\\+(.*)\\:(.*)', 2) as int));
e.g.,
hive> select extract_ts('2018-09-13T11:35:09+10:00');
OK
2018-09-13 21:35:09
Upvotes: 0