connecttopawan
connecttopawan

Reputation: 218

Difference between unix_timestamp and casting to timestamp

I am having a situation for a hive table, to convert a two fields of numeric string (T1 and T2) to date timestamp format "YYYY-MM-DD hh:mm:ss.SSS" and to find difference of both. I have tried two methods:

Select CAST(regexp_replace(substring(t1, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp), CAST(regexp_replace(substring(t2, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp), CAST(regexp_replace(substring(t1, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp) - CAST(regexp_replace(substring(t2, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp) as time_diff
from tab1

And getting output as

enter image description here

Select from_unixtime (unix_timestamp(substring(t1,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS'), from_unixtime (unix_timestamp(substring(t2,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS'), from_unixtime (unix_timestamp(substring(t1,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS') - from_unixtime (unix_timestamp(substring(t2,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS') as time_diff
from tab1;

And getting output as

enter image description here

I am not getting clear why there is difference in outputs.

Upvotes: 0

Views: 1379

Answers (2)

AbhishekB
AbhishekB

Reputation: 31

unix_timestamp() gives you epoch time ie. time in seconds since unix epoch 1970-01-01 00:00:00 Whereas the the timestamp will provide date and time viz YYYY-MM-DD T HH:MI:SS Hence an accurate way would be to convert the string timestamp to unix_timestamp(), subtract and then convert back using from_unixtime() eg.

select from_unixtime(unix_timestamp('2020-04-12 01:30:02.000') - unix_timestamp('2020-04-12 01:29:43.000'))

Upvotes: 2

Ganesh Chandrasekaran
Ganesh Chandrasekaran

Reputation: 1936

Method 2 finally equates to something like this

select ('2020-04-12 01:30:02.000' - '2020-04-12 01:29:43.000') as time_diff;

You cannot subtract dates like this.. you have to use DateDiff.

In Hive DateDiff returns > 0 only if there is a diff in day else you get zero.

Upvotes: 1

Related Questions