Reputation: 1962
I have a table that has some data on active users. It displays what version the user was on and the date they were active. Let's call this table active_users:
Version Time
-------------------------
1 '2018-03-12'
1 '2018-03-01'
1 '2018-03-06'
1 '2018-03-09'
2 '2018-01-02'
2 '2018-01-04'
2 '2018-01-05'
2 '2018-01-11'
I also have a table that holds the date each version was released and the date one week after the release. Let's call this table release_dates:
Version Release_Date Week_After_Release_Date
------------------------------------------------
1 '2018-03-01' '2018-03-08'
2 '2018-01-02' '2018-01-09'
Now what I want to do, in one query, is get a count of the users of each version who were active between the two dates listed in that version's row of the second table.
The result would look like this:
Version Count
------------------------------------------------
1 2
2 3
as there are two rows in active_users that are version 1 and have dates between 2018-03-01 and 2018-03-08, and three rows that are version 2 and have dates between 2018-01-02 and 2018-01-09.
This would be easy to do with a for-loop because I could just iterate over each row in release_dates, run a query against active_users with those parameters, and concatenate all the results at the end. Unfortunately I am working under the constraint of doing a single query. Is this possible? I am using MySQL as my database.
Upvotes: 1
Views: 73
Reputation: 372
First of all, you want to know which users were active during each time interval in the second table. You can find them by joining the two tables on Version and keeping only the rows whose Time falls inside the interval:
SELECT
au.Version, au.Time
FROM
active_users AS au, release_dates AS rd
WHERE
au.Version = rd.Version AND au.Time >= rd.Release_Date AND au.Time < rd.Week_After_Release_Date
You can get the following result:
Version | Time
---------+------------
1 | 2018-03-01
1 | 2018-03-06
2 | 2018-01-02
2 | 2018-01-04
2 | 2018-01-05
Based on this result, you can group by Version and get your final stats:
SELECT
Version, COUNT(*)
FROM
(
SELECT
au.Version
FROM
active_users AS au, release_dates AS rd
WHERE
au.Version = rd.Version AND au.Time >= rd.Release_Date AND au.Time < rd.Week_After_Release_Date
) v
GROUP BY
Version
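To check the join-then-group approach against the sample data from the question, here is a minimal sketch that loads those rows into an in-memory SQLite database and runs the same logic with explicit JOIN syntax. SQLite stands in for MySQL purely so the example is self-contained; the function name and the user_count alias are illustrative choices, and the dates are stored as ISO-formatted text.

```python
import sqlite3

# Sample rows copied from the question.
ACTIVE_USERS = [
    (1, "2018-03-12"), (1, "2018-03-01"), (1, "2018-03-06"), (1, "2018-03-09"),
    (2, "2018-01-02"), (2, "2018-01-04"), (2, "2018-01-05"), (2, "2018-01-11"),
]
RELEASE_DATES = [
    (1, "2018-03-01", "2018-03-08"),
    (2, "2018-01-02", "2018-01-09"),
]

# Same logic as the answer's query, written with an explicit JOIN.
# The alias user_count is an arbitrary name chosen for this sketch.
QUERY = """
SELECT au.Version, COUNT(*) AS user_count
FROM active_users AS au
JOIN release_dates AS rd ON au.Version = rd.Version
WHERE au.Time >= rd.Release_Date
  AND au.Time < rd.Week_After_Release_Date
GROUP BY au.Version
ORDER BY au.Version
"""


def count_active_in_release_week():
    """Build the two tables in memory and return (version, count) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE active_users (Version INTEGER, Time TEXT)")
    conn.execute(
        "CREATE TABLE release_dates "
        "(Version INTEGER, Release_Date TEXT, Week_After_Release_Date TEXT)"
    )
    conn.executemany("INSERT INTO active_users VALUES (?, ?)", ACTIVE_USERS)
    conn.executemany("INSERT INTO release_dates VALUES (?, ?, ?)", RELEASE_DATES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(count_active_in_release_week())  # [(1, 2), (2, 3)]
```

Because the filter sits in the WHERE clause, a version with no activity inside its release week would not appear in the output at all.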
Upvotes: 0
Reputation: 1269443
One way is a correlated subquery:
select rd.*,
(select count(*)
from active_users au
where au.version = rd.version and
au.time >= rd.release_date and
au.time < rd.Week_After_Release_Date + interval 1 day
)
from release_dates rd;
Note that I do not use between on the dates, just in case the dates have a time component.
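The point about time components can be illustrated with a small sketch. It assumes, hypothetically, that active_users stores full timestamps, and it uses SQLite in place of MySQL so it can run standalone. SQLite's date(x, '+1 day') plays the role of MySQL's x + interval 1 day, and the afternoon row on the last day of the window is invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE active_users (version INTEGER, time TEXT)")
conn.execute(
    "CREATE TABLE release_dates "
    "(version INTEGER, release_date TEXT, week_after_release_date TEXT)"
)
conn.execute("INSERT INTO release_dates VALUES (1, '2018-03-01', '2018-03-08')")
conn.executemany(
    "INSERT INTO active_users VALUES (?, ?)",
    [
        (1, "2018-03-01 00:00:00"),
        (1, "2018-03-08 15:30:00"),  # hypothetical: afternoon of the last day
    ],
)

# BETWEEN stops at the start of 2018-03-08, so the afternoon row is missed.
between_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM active_users au
    JOIN release_dates rd ON au.version = rd.version
    WHERE au.time BETWEEN rd.release_date AND rd.week_after_release_date
    """
).fetchone()[0]

# Half-open range: everything strictly before the day after the end date.
half_open_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM active_users au
    JOIN release_dates rd ON au.version = rd.version
    WHERE au.time >= rd.release_date
      AND au.time < date(rd.week_after_release_date, '+1 day')
    """
).fetchone()[0]

conn.close()
print(between_count, half_open_count)  # 1 2
```

SQLite compares these values as ISO-formatted strings while MySQL compares them as temporal values, but the outcome is the same in both: a bare date as the upper bound means midnight at the start of that day.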
Upvotes: 0
Reputation: 44871
One way to get the result you want is to combine an aggregate operation with a conditional expression, so that only the rows that match the condition are counted:
select
au.version,
sum(if(au.time between rd.release_date and rd.week_after_release_date,1,0)) as count
from active_users au
join release_dates rd on au.version = rd.version
group by au.version
If you prefer to use the count function, the third line could be written as:
count(if(au.time between rd.release_date and rd.week_after_release_date, au.time, null)) as count
If you want the query to be slightly more portable, you could use a case expression instead of if, since case is standard SQL while if() is MySQL-specific.
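As a rough sketch of the case variant, the snippet below runs the conditional aggregation against the question's sample data in an in-memory SQLite database (used here instead of MySQL only so the example is runnable on its own). Version 3 and its single row are hypothetical additions, and the in_window alias is an arbitrary name; they show what happens to a version with no activity inside its window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE active_users (version INTEGER, time TEXT)")
conn.execute(
    "CREATE TABLE release_dates "
    "(version INTEGER, release_date TEXT, week_after_release_date TEXT)"
)
conn.executemany(
    "INSERT INTO active_users VALUES (?, ?)",
    [
        (1, "2018-03-12"), (1, "2018-03-01"), (1, "2018-03-06"), (1, "2018-03-09"),
        (2, "2018-01-02"), (2, "2018-01-04"), (2, "2018-01-05"), (2, "2018-01-11"),
        (3, "2018-06-01"),  # hypothetical: active only after the release week
    ],
)
conn.executemany(
    "INSERT INTO release_dates VALUES (?, ?, ?)",
    [
        (1, "2018-03-01", "2018-03-08"),
        (2, "2018-01-02", "2018-01-09"),
        (3, "2018-05-01", "2018-05-08"),  # hypothetical version
    ],
)

# CASE replaces MySQL's IF(); rows outside the window contribute 0 to the sum.
counts = conn.execute(
    """
    SELECT au.version,
           SUM(CASE
                 WHEN au.time BETWEEN rd.release_date
                                  AND rd.week_after_release_date
                 THEN 1 ELSE 0
               END) AS in_window
    FROM active_users au
    JOIN release_dates rd ON au.version = rd.version
    GROUP BY au.version
    ORDER BY au.version
    """
).fetchall()

conn.close()
print(counts)  # [(1, 2), (2, 3), (3, 0)]
```

Because the condition lives inside the aggregate rather than in a WHERE clause, a version whose rows all fall outside the window still appears, with a count of 0.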
Upvotes: 1