Varun Gupta
Varun Gupta

Reputation: 1457

Convert string to timestamp hive

I have string with timestamp value such as "2013-02-02T04:04:11.240-4:00".

So goal is to convert in the timestamp format like 2013-02-02 04:04:11. How can I do this.

This command works fine

from_unixtime(unix_timestamp(substr('20130502081559999',1,14), 'yyyyMMddHHmmss'))

but when I try something like this it give error with 'T'

from_unixtime(unix_timestamp(substr('2013-05-02T08:15:59.999-4:00',1,19), 'yyyy-MM-ddTHH:mm:ss'))

Upvotes: 1

Views: 12438

Answers (1)

Mike W.
Mike W.

Reputation: 131

For everything in your string except the time zone offset, this pattern will work:

from_unixtime(unix_timestamp("2013-02-02T04:04:11.240", "yyyy-MM-dd'T'HH:mm:ss.SSS"))

The patterns for the SimpleDateFormat are here.

The additional pattern XXX should handle your time zone offset. But as noted on this page and this page, the X pattern was introduced in Java 7.

If you're running an earlier version of Java (like me - my Cloudera QuickStart VM uses 1.6.0_32), you'll have to do some additional manipulation to get this to work. One way to do this is:

from_unixtime(unix_timestamp( 
    regexp_replace(
        regexp_replace("2013-02-02T04:04:11.240-4:00", "(.*):(.{2})$", "$1$2"),
    "(.*)-(.{3})$", "$1-0$2"), 
"yyyy-MM-dd'T'HH:mm:ss.SSSZ"))

The innermost regexp_replace strips the colon (:) out of the time zone offset. The outermost regexp_replace pads the time zone offset with an extra zero, if needed (i.e. -4:00 becomes -0400, which matches the Z pattern. Note that -400 does not match the Z pattern).

Upvotes: 5

Related Questions