Reputation: 65
I am trying to identify repeating groups of data and have been having hard time thought it sounds simple.
Following is my Dataset.
ID, TIME, Speed
1, 10, 0
1, 11, 0
1, 12, 0
1, 12, 1
1, 13, 2
1, 14, 0
1, 15, 0
1, 16, 4
I am trying to accomplish the Min and Max of Repeating Zero speed. In this Case
ID, Min, Max, Value
1 , 10 , 12, 0
1, 14, 15, 0
I have tried Lead / Lag functions but am able to get each individual row. But not sets of zeros. I am trying to get this in Hive.
Any Help is greatly Appreciated.
Upvotes: 1
Views: 891
Reputation: 1270993
You can do this with a difference of row numbers:
select id, speed, min(time), max(time)
from (select t.*,
(row_number() over (partition by id order by time) -
row_number() over (partition by id, speed order by time)
) as grp
from table t
) t
where speed = 0
group by id, speed, grp;
The difference of row numbers is constant for when adjacent values are the same.
Upvotes: 2