Reputation: 408
I need to calculate the difference between the various interactions for a serial number in hive. My table has multiple serial numbers with with multiple interactions. I have ordered the interactions for a serial number based on its timestamp. I would now also be interested in adding a column with the difference in timestamps for consecutive rows.
For example, the last column in the image shows the expected output.
Appreciate your help.
Upvotes: 0
Views: 815
Reputation: 222462
You can use lag()
to access the "previous" row in a given partition:
select
t.*,
lag(timestamp) over(partition by sr_no order by timestamp) as lag_timestamp
from mytable t
Say you want the difference in seconds between the two timestamps, then:
select
t.*,
unix_timestamp(timestamp)
- unix_timestamp(lag(timestamp) over(partition by sr_no order by timestamp))
as diff_seconds
from mytable t
Upvotes: 1