Reputation: 11387
My table has ACTIONS with USER and date/time of each action. A group of Actions by the same user is a session. I want to measure the duration of each session. However if there is a gap larger than 5 minutes without any action, then I want to start a new session.
I am using this SQL
Select USER,
TRUNC((Max(ACTION_DATE) - Min(ACTION_DATE))*24*60,2) as Minutes
from USER_ACTION_DATA
Group by USER,TRUNC(ACTION_DATE,'J') ;
And it works pretty well if the user only has one session per day. But if the data is
USER ACTION_DATE
---------------------
john 2021-05-24 11:30:22
john 2021-05-24 11:32:12
john 2021-05-24 11:32:44
john 2021-05-24 11:36:08
john 2021-05-24 14:20:02
john 2021-05-24 14:23:52
it will show a single session with 173 minutes. But that is wrong. It should be two sessions with 6 and 3 minutes (because the gap between the 4th and 5th record is more than 5 minutes). Is this possible with SQL, or do I have to do the grouping with a real programming language?
Upvotes: 1
Views: 84
Reputation: 191425
If you're on a recent version of Oracle you can use match_recognize
:
select user_name,
start_date,
end_date,
trunc((end_date - start_date) * 24 * 60, 2) as minutes
from user_action_data
match_recognize (
partition by user_name
order by action_date
measures
first(action_date) as start_date,
last(action_date) as end_date
pattern (A B*)
define B as action_date <= prev(action_date) + interval '5' minute
);
USER_NAME | START_DATE | END_DATE | MINUTES |
---|---|---|---|
john | 2021-05-24 11:30:22 | 2021-05-24 11:36:08 | 5.76 |
john | 2021-05-24 14:20:02 | 2021-05-24 14:23:52 | 3.83 |
USER
is a reserved word (and a function) so I've changed the column name to USER_NAME
to make it valid.
Because it's only looking at the interval it will allow a session to span midnight, rather than restricting to sessions within a day, as you are doing by truncating with J. I'm assuming that's a good thing, of course. If it isn't then you can change it to only look within the same day:
define B as trunc(action_date) = trunc(prev(action_date))
and action_date <= prev(action_date) + interval '5' minute
db<>fiddle with some additional sample data to go into the next day.
Upvotes: 1
Reputation: 1270503
You can assign a session using window functions -- check when the previous action is and assigning a session start when needed:
select uad.*,
sum(case when prev_action_date > action_date - interval '5' minute then 0 else 1 end) over
(partition by user order by action_date) as session_id
from (select uad.*,
lag(action_date) over (partition by user order by action_date) as prev_action_date
from USER_ACTION_DATA uad
) uad;
You can then aggregate this if you like:
select user, min(action_date), max(action_date),
( max(action_date) - min(action_date) ) * 24*60*60
from (select uad.*,
sum(case when prev_action_date > action_date - interval '5' minute then 0 else 1 end) over
(partition by user order by action_date) as session_id
from (select uad.*,
lag(action_date) over (partition by user order by action_date) as prev_action_date
from USER_ACTION_DATA uad
) uad
) uad
group by user, session_id;
Upvotes: 1