Reputation: 466

Is there a way to find active users in SQL?

I'm trying to find the total count of active users in a database. "Active" users here as defined as those who have registered an event on the selected day or later than the selected day. So if a user registered an event on days 1, 2 and 5, they are counted as "active" throughout days 1, 2, 3, 4 and 5.

My original dataset looks like this (note that this is a sample - the real dataset will run to up to 365 days, and has around 1000 users).

As you can see, all 5 IDs are active on Day 0, and 2 IDs (1 and 2) are active until Day 4, so I'd like the finished table to look like this:

Day    Count
0      5
1      2
2      2
3      2
4      2

I've tried using the following query:

select Day as days, sum(case when Day <= days then 1 else 0 end)
from df

But it gives incorrect output (only counts users who were active on each specific days).

I'm at a loss as to what I could try next. Does anyone have any ideas? Many thanks in advance!

Upvotes: 1

Answers (3)

Gordon Linoff

Reputation: 1270713

I think I would just use generate_series():

select gs.d, count(*)
from (select id, min(day) as min_day, max(day) as max_day
      from t
      group by id
     ) t cross join lateral
     generate_series(t.min_day, .max_day, 1) gs(d)
group by gs.d
order by gs.d;

If you want to count everyone as active from day 1 -- but not all have a value on day 1 -- then use 1 instead of min_day.

Here is a db<>fiddle.

Upvotes: 1

thiaguerd

Reputation: 847

You need count by day.

select
    id,
    count(*)
from df
GROUP BY
    id

Upvotes: 0

Georgi Raychev

Reputation: 1334

A bit verbose, but this should do:

with dt as (
        select 0 d, 1 id
        union all
        select 0 d, 2 id
        union all
        select 0 d, 3 id
        union all
        select 0 d, 4 id
        union all
        select 0 d, 5 id
        union all
        select 1 d, 1 id
        union all
        select 1 d, 2 id
        union all
        select 2 d, 1 id
        union all
        select 3 d, 1 id
        union all
        select 4 d, 1 id
        union all
        select 4 d, 2 id
)
, active_periods as (
        select id
                , min(d) min_d
                , max(d) max_d
        from dt
        group by id
)
, days as (
        select distinct d
        from dt
)
select d.d
        , count(ap.id)
from days d
join active_periods ap on d.d between ap.min_d and ap.max_d
group by 1
order by 1 asc

Upvotes: 0

Is there a way to find active users in SQL?

Answers (3)

Related Questions