Reputation: 43
I have a table called activity
that contains values like the following:
userId | timestamp | action
----------------------------------------
1 | 2022-10-18 10:00:00 | OPEN
2 | 2022-10-18 10:20:00 | OPEN
1 | 2022-10-18 10:05:00 | CLOSE
2 | 2022-10-18 10:22:00 | CLOSE
...
I want to be able to get the difference between the open and close time per user for a given day like so:
desiredTable
------------------------------------------------------
userId | start_time | time_elapsed_in_minutes
------------------------------------------------------
1 | 2022-10-18 10:00:00 | 5
2 | 2022-10-18 10:20:00 | 2
A couple of things to note are:
OPEN
and CLOSE
rows will be back to back to each other as the table also holds a lot of other action types.OPEN
and CLOSE
rows due to network conditions that could result in either not being reported. I.e: user 1 can have 3 opens and 1 close, so only 1 pair needs to be calculated.My approach:
userId | # opens | # closes
userId | timestamp
userId | timestamp
Any help would be greatly appreciated!
Upvotes: 1
Views: 144
Reputation: 25988
As suggested by @Robert Hamilton this can be directly translated into a query using window functions:
select userId,
start_time,
floor(extract('epoch' from duration)/60) duration_in_minutes
from (
select userId,
timestamp as start_time,
case when action='CLOSE' and (lag(action) over w1)='OPEN'
then timestamp - (lag(timestamp) over w1)
end as duration
from activity
window w1 as (partition by userId order by timestamp)) a
where duration is not null;
But the same thing can be achieved using very basic constructs - I used CTEs below only for readability:
with
"nearest_close_after_each_open" as
( select
open.userId,
open.timestamp as start_time,
min(close.timestamp) as end_time
from activity as close
inner join activity as open
on open.action='OPEN' and close.action='CLOSE'
and open.userId=close.userId
and open.timestamp < close.timestamp
group by 1,2),
"longest_open_window_before_a_close" as
( select
userId,
end_time,
min(start_time) as start_time
from nearest_close_after_each_open
group by 1,2),
"shortest_open_window_before_a_close" as
( select
userId,
end_time,
max(start_time) as start_time
from nearest_close_after_each_open
group by 1,2)
select
userId,
start_time,
end_time,
floor(extract('epoch' from end_time-start_time )/60) as time_elapsed_in_minutes
from "shortest_open_window_before_a_close";
--from "longest_open_window_before_a_close";
I'd normally expect all following OPEN
s in a sequence to be retransmissions of the first one, similar to what you assume about CLOSE
s, which is why I also added longest_open_window_before_a_close
- pairing up the earliest recorded OPEN
before a CLOSE
. By default, the code uses shortest_open_window_before_a_close
of your choice.
One unaddressed detail I spotted is
difference between the open and close time per user for a given day
Which I think would mean all open windows should be cut off at midnight and all orphaned CLOSES
on the following date should be assumed to have opened at midnight.
Upvotes: 1
Reputation: 4620
We can use lead()
when next action
is closed
.
select *
from (
select userid
,timestamp as start_time
,case lead(action) over(partition by userid order by timestamp) when 'CLOSE' then lead(timestamp) over(partition by userid order by timestamp)-timestamp end as time_elapsed_in_minutes
from t
where action in('OPEN', 'CLOSE')
) t
where time_elapsed_in_minutes is not null
userid | start_time | time_elapsed_in_minutes |
---|---|---|
1 | 2022-10-18 10:00:00 | 00:05:00 |
2 | 2022-10-18 10:20:00 | 00:02:00 |
Upvotes: 0
Reputation: 4620
We group every action = 'close'
with the action = 'open'
that comes before it. Then we group by id and choose the timestamps.
select userid
,min(timestamp) as start_time
,max(timestamp)-min(timestamp) as time_elapsed_in_minutes
from (
select *
,count(case action when 'OPEN' then 1 end) over(partition by userid order by timestamp) as grp
from t
where action in('OPEN', 'CLOSE')
) t
group by userid, grp
userid | start_time | time_elapsed_in_minutes |
---|---|---|
1 | 2022-10-18 10:00:00 | 00:05:00 |
2 | 2022-10-18 10:20:00 | 00:02:00 |
Upvotes: 0