Reputation: 23
For each user login to our website, we insert a record into our user_session table with the user's login and logout timestamps. If I wanted to produce a graph of the number of logins per hour over time, it would be easy with the following SQL.
SELECT
date_trunc('hour',login_time) AS "time",
count(*)
FROM user_session
group by time
order by time
Time would be the X-axis and count would be the Y-axis.
But what I really need is the number of active sessions in each hour where "active" means login_time <= foo and logout_time >= foo where foo is the particular time slot.
How can I do this in a single SELECT statement?
Upvotes: 1
Views: 1098
Reputation: 1271241
One brute force method generates the hours and then uses a lateral join or correlated subquery to do the calculation:
select gs.ts, us.num_active
from generate_series('2021-03-21'::timestamp, '2021-03-22'::timestamp, interval '1 hour') gs(ts) left join lateral
(select count(*) as num_active
from user_session us
where us.login_time <= gs.ts and
us.logout_time > gs.ts
) us
on 1=1;
A more efficient method -- particularly for longer periods of time -- is to pivot the times and keep an incremental count is ins and outs:
with cte as (
select date_trunc('hour', login_time) as hh, count(*) as inc
from user_session
group by hh
select date_trunc('hour', logout_time + interval '1 hour') as hh, - count(*) as inc
from user_session
group by hh
)
select hh, sum(inc) as net_in_hour,
sum(sum(inc)) over (order by hh) as active_in_hour
from cte
group by hh;
Upvotes: 1