Valerie
Valerie

Reputation: 25

How to convert timestamp from different timezones in Hive

I am querying a table in Hive with json payloads and am extracting the timestamp from these payloads. the problem is that timestamps are present in different timezone formats and I'm trying to extract them all in my timezone.

I am currently using the following:

select
from_unixtime(unix_timestamp(get_json_object (table.payload, '$.timestamp'), "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"))
FROM table

This is returning the correct values if the timestamp is in this format: 2018-08-16T08:54:05.543Z --> 2018-08-16 18:54:05 (changed format and converted into my timezone)

However the query above returns 'null' if the payload contains the timestamp in this format: 2018-09-13T01:35:08.460+0000 2018-09-13T11:35:09+10:00

How can I adjust my query to work for all types of timestamps all converting to proper timezone (+10 is my timezone!) and all in the same format?

Thanks in advance!

Upvotes: 0

Views: 4496

Answers (2)

selitsky
selitsky

Reputation: 44

Without regexp use Z for +1000 of XXX for +10:00 :

select unix_timestamp('2016-07-30T10:29:33.000+03:00', "yyyy-MM-dd'T'HH:mm:ss.SSSXXX") as t1
select unix_timestamp('2016-07-30T10:29:33.000+0300', "yyyy-MM-dd'T'HH:mm:ss.SSSZ") as t2

Full docs about time formats: https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html

Upvotes: -1

serge_k
serge_k

Reputation: 1772

How about the following macro:

create temporary macro extract_ts(ts string) 
  from_unixtime(unix_timestamp(regexp_extract(ts, '(.*)\\+(.*)', 1), "yyyy-MM-dd'T'HH:mm:ss") + 3600*cast(regexp_extract(ts, '(.*)\\+(.*)\\:(.*)', 2) as int));

e.g.,

hive> select extract_ts('2018-09-13T11:35:09+10:00');
OK
2018-09-13 21:35:09

Upvotes: 0

Related Questions