Reputation: 11
I have a table, activity
that looks like the following:
date | user_id |
Thousands of users and multiple dates and activity for all of them. I want to pull a query that will, for every day in the result, give me the total active users in the last 30 days. The query I have now looks like the following:
select date, count(distinct user_id) from activity where date > date_sub(date, interval 30 day) group by date
This gives me total unique users on only that day; I can't get it to give me the last 30 for each date. Help is appreciated.
Upvotes: 1
Views: 2990
Reputation: 21533
To do this you need a list of the dates and join that against the activities.
As such this should do it. A sub query to get the list of dates and then a count of user_id (or you could use COUNT(*) as I presume user_id cannot be null):-
SELECT date, COUNT(user_id)
FROM
(
SELECT DISTINCT date, DATE_ADD(b.date, INTERVAL -30 DAY) AS date_minus_30
FROM activity
) date_ranges
INNER JOIN activity
ON activity.date BETWEEN date_ranges.date_minus_30 AND date_ranges.date
GROUP BY date
However if there can be multiple records for a user_id on any particular date but you only want the count of unique user_ids on a date you need to count DISTINCT user_id (although note that if a user id occurs on 2 different dates within the 30 day date range they will only be counted once):-
SELECT activity.date, COUNT(DISTINCT user_id)
FROM
(
SELECT DISTINCT date, DATE_ADD(b.date, INTERVAL -30 DAY) AS date_minus_30
FROM activity
) date_ranges
INNER JOIN activity
ON activity.date BETWEEN date_ranges.date_minus_30 AND date_ranges.date
GROUP BY date
A bit cruder would be to just join the activity table against itself based on the date range and use COUNT(DISTINCT ...) to just eliminate the duplicates:-
SELECT a.date, COUNT(DISTINCT a.user_id)
FROM activity a
INNER JOIN activity b
ON a.date BETWEEN DATE_ADD(b.date, INTERVAL -30 DAY) AND b.date
GROUP by a.date
Upvotes: 2