Vlad Aleshin
Vlad Aleshin

Reputation: 411

Define user's sessions (sql)

I have an event table (user_id, timestamp). I need to write a query to define a user session (every user can have more than one session and every session can have >= 1 event). 30 minutes of inactivity for the user is a completed session. The output table should have the following format: (user_id, start_session, end_sesson). I wrote part of query, but what to do next i have no idea.

select  
t.user_id,
t.ts start_session,
t.next_ts
from ( select 
          user_id,
          ts,
          DATEDIFF(SECOND, lag(ts, 1) OVER(partition by user_id order by ts), ts) next_ts   
      from 
          events_tabl ) t

Upvotes: 0

Views: 331

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270623

You want a cumulative sum to identify the sessions and then aggregation:

select user_id, session_id, min(ts), max(ts)
from (select e.*,
             sum(case when prev_ts > dateadd(minute, -30, ts)
                      then 0 else 1
                 end) over (partition by user_id order by ts) as session_id
      from (select e.*,
                   lag(ts) over (partition by user_id order by ts), ts) as prev_ts   
            from events_tabl e
           ) e
     ) e
group by user_id, session_id;

Note that I changed the date/time logic from using datediff() to a direct comparison of the times. datediff() counts the number of "boundaries" between two times. So, there is 1 hour between 12:59 a.m. and 1:01 a.m. -- but zero hours between 1:01 a.m. and 1:59 a.m.

Although handling the diffs at the second level produces similar results, you can run into occasions where you are working with seconds or milliseconds -- but the time spans are too long to fit into an integer. Overflow errors. It is just easier to work directly with the date/time values.

Upvotes: 1

Related Questions