Reputation: 1530
I am struggling to find a way to rank the events using SQL.
The goal is to increment the rank whenever an event occurred more than delta
seconds (e.g. 1 second) from the previous observation. So far my attempt is shown below:
select a.event_time, a.user_name, a.object_name, a.rnk, case when a.ddif <= 1000 then 0 else 1 end as new_query,
case when a.ddif <= 1000 then 0 else rnk end as new_rnk
from (
select *, rank() OVER (PARTITION BY user_name ORDER BY event_time) AS rnk,
date_diff('second',lag(event_time) OVER (PARTITION BY user_name ORDER BY event_time),event_time) as ddif
from tmp
) a
But it gives me just the following results and I still don't know how to achieve the results in yellow
(either of them works for me perfectly).
I'd appreciate any help with that.
Please note: I am using Presto DB, hence I am limited to this query engine.
Upvotes: 2
Views: 506
Reputation: 1270391
Use lag()
and a cumulative sum to define the groups. Then assign row numbers:
select t.*,
row_number() over (partition by user_name, grp order by event_time) as seqnum
from (select t.*,
sum(case when prev_et > event_time - interval '1' second
then 0 else 1
end) over (partition by user_name order by event_time) as grp
from (select t.*,
lag(event_time) over (partition by user_name order by event_time) as prev_et
from tmp t
) t
) t;
Upvotes: 1
Reputation: 1530
Thanks for all the good tips that pointed me the direction to the final solution which is:
select a.*, sum (case when a.ddif <= 1 then 0 else 1 end) over (partition by user_name order by event_time) as acc_rnk
from (
select *, date_diff('second',lag(event_time) OVER (PARTITION BY user_name ORDER BY event_time),event_time) as ddif
from tmp
) a
Upvotes: 1
Reputation: 222582
You can use lag()
and a window sum()
:
select
t.*,
sum(case when event_time <= lag_event_time + interval '1' second then 0 else 1 end) rnk
from (
select
t.*,
lag(event_time) over(order by event_time partition by user_name) lag_event_time
from mytable t
) t
Upvotes: 1