Reputation: 11
I have a table that logs every action my users perform, with a timestamp. Its columns look like this:
user_id; action; creation_date
I want to assess the number of active users over time. I can already get the number of users active per week (count(distinct user_id)), but I want the number of distinct users who have been active at any point from the beginning of the year up to each week, like this:
- number of users active in week 1
- number of users active in week 1 or week 2
- number of users active in week 1, week 2 or week 3
etc
I can't just sum the weekly active-user counts, because users come back in later weeks and would be counted more than once.
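To make the double-counting concrete, here is a minimal sketch in Python on made-up data. The user ids and the `active_by_week` mapping are hypothetical; the point is only that a running union of users differs from a running sum of weekly counts.

```python
# Toy data (hypothetical): the set of user_ids active in each week.
active_by_week = {
    1: {"a", "b"},
    2: {"b", "c"},
    3: {"a", "d"},
}


def cumulative_active(active_by_week):
    """Distinct users active in any week up to and including each week."""
    seen = set()
    totals = {}
    for week in sorted(active_by_week):
        seen |= active_by_week[week]  # set union ignores returning users
        totals[week] = len(seen)
    return totals


def naive_running_sum(active_by_week):
    """Running sum of weekly counts; returning users are counted again."""
    total = 0
    totals = {}
    for week in sorted(active_by_week):
        total += len(active_by_week[week])
        totals[week] = total
    return totals


print(cumulative_active(active_by_week))  # {1: 2, 2: 3, 3: 4}
print(naive_running_sum(active_by_week))  # {1: 2, 2: 4, 3: 6}
```

The first result is the figure being asked for; the second overstates it as soon as any user is active in more than one week.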
For now I can do it per week with this query:
SELECT count(distinct uca.user_id)
FROM DATABASE
WHERE (extract(week from uca.creation_date)) <= 9
How can I write a single query that gives me this for every week?
(Sorry this was so long; I wanted to be extra clear.)
Thank you!
Upvotes: 1
Views: 61
Reputation: 4425
I'm afraid the correlated subquery in Gordon Linoff's answer might be executed once for each row of the input, which is inefficient. Here is an equivalent query that joins against the list of distinct weeks instead, so the cumulative count is computed once per output row:
SELECT
weeks.year,
weeks.week,
count(distinct uca.user_id) as YTD_users
FROM uca
JOIN (SELECT year(creation_date) as year, week(creation_date) as week
FROM uca
GROUP BY year(creation_date), week(creation_date)
) AS weeks
ON year(uca.creation_date) = weeks.year AND week(uca.creation_date) <= weeks.week
GROUP BY weeks.year, weeks.week
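As a sanity check of the join-based approach, here is a minimal sketch that runs it on toy data with Python's built-in sqlite3 module. Several things here are assumptions and not part of the question: the engine is SQLite, so `strftime('%Y', ...)` and `strftime('%W', ...)` stand in for `year()` and `week()`; the sample rows are invented; and the result is grouped by the year and week taken from the derived `weeks` table.

```python
import sqlite3

# In-memory table with the three columns described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uca (user_id TEXT, action TEXT, creation_date TEXT)")

# Hypothetical sample data: three consecutive weeks of January 2023.
rows = [
    ("a", "login", "2023-01-03"),
    ("b", "login", "2023-01-04"),
    ("b", "login", "2023-01-10"),
    ("c", "login", "2023-01-11"),
    ("a", "login", "2023-01-17"),
    ("d", "login", "2023-01-18"),
]
conn.executemany("INSERT INTO uca VALUES (?, ?, ?)", rows)

# Join every action to each week of the same year that is on or after it,
# then count distinct users per (year, week) of the derived table.
QUERY = """
SELECT weeks.yr, weeks.wk, COUNT(DISTINCT uca.user_id) AS ytd_users
FROM uca
JOIN (SELECT DISTINCT strftime('%Y', creation_date) AS yr,
                      strftime('%W', creation_date) AS wk
      FROM uca) AS weeks
  ON strftime('%Y', uca.creation_date) = weeks.yr
 AND strftime('%W', uca.creation_date) <= weeks.wk
GROUP BY weeks.yr, weeks.wk
ORDER BY weeks.yr, weeks.wk
"""

result = conn.execute(QUERY).fetchall()
for yr, wk, ytd_users in result:
    print(yr, wk, ytd_users)  # cumulative counts are 2, 3, 4
```

Users "a" and "b" each appear in two different weeks but are counted once in the cumulative totals. Week numbering rules differ between database engines, so the week labels may not match what `week()` returns elsewhere.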
Upvotes: 0
Reputation: 1271231
You can use a subquery. Your sample query is a bit confusing (the table alias uca is not defined, and you are extracting only the week, not the year). I think you want something like this:
SELECT year(uca.creation_date), week(uca.creation_date),
count(distinct uca.user_id) as WEEK_users,
(select count(distinct uca2.user_id)
from database uca2
where year(uca2.creation_date) = year(uca.creation_date) and
week(uca2.creation_date) <= week(uca.creation_date)
) as YTD_users
FROM DATABASE uca
GROUP BY year(uca.creation_date), week(uca.creation_date)
Upvotes: 3