joeschmidt45

Reputation: 1962

SQL: Get count of rows in a list of date ranges

I have a table that has some data on active users. It displays what version the user was on and the date they were active. Let's call this table active_users:

Version   Time
-------------------------
1         '2018-03-12'   
1         '2018-03-01'     
1         '2018-03-06'     
1         '2018-03-09'     
2         '2018-01-02'     
2         '2018-01-04'     
2         '2018-01-05'     
2         '2018-01-11'

I also have a table that holds the date each version was released and the date one week after the release. Let's call this table release_dates:

Version   Release_Date   Week_After_Release_Date
------------------------------------------------
1         '2018-03-01'   '2018-03-08'      
2         '2018-01-02'   '2018-01-09'

What I want to do, in one query, is count for each version the users who were active between the two dates listed in that version's row of the second table.

The result would look like this:

Version   Count  
------------------------------------------------
1         2
2         3

as there are two rows in active_users that are version 1 and have dates between 2018-03-01 and 2018-03-08 and three rows that are version 2 and have dates between 2018-01-02 and 2018-01-09.

This would be easy to do with a for-loop: I could iterate over each row in release_dates, query active_users with those parameters, and concatenate all the results at the end. Unfortunately, I am working under the constraint of using a single query. Is this possible? I am using MySQL as my database.

Upvotes: 1

Views: 73

Answers (3)

Maokai

Reputation: 372

First, find which users were active during each version's release window by joining the two tables on Version and filtering on the date range (this is an inner join written with the comma syntax):

SELECT
  au.Version, au.Time
FROM
  active_users AS au, release_dates AS rd 
WHERE 
  au.Version = rd.Version AND au.Time >= rd.Release_Date AND au.Time < rd.Week_After_Release_Date

You can get the following result:

 Version |    Time
---------+------------
       1 | 2018-03-01
       1 | 2018-03-06
       2 | 2018-01-02
       2 | 2018-01-04
       2 | 2018-01-05

Based on this result, you can group by Version and get your final stats:

SELECT
  Version, COUNT(*)
FROM
  (
    SELECT
      au.Version
    FROM
      active_users AS au, release_dates AS rd 
    WHERE 
      au.Version = rd.Version AND au.Time >= rd.Release_Date AND au.Time < rd.Week_After_Release_Date
  ) v
GROUP BY
  Version
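
To check the join-then-group approach against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption made so the example is self-contained), the join is written with explicit JOIN syntax, and the active_count alias and the join_counts variable are illustrative names, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active_users (Version INTEGER, Time TEXT);
    CREATE TABLE release_dates (
        Version INTEGER, Release_Date TEXT, Week_After_Release_Date TEXT
    );
    INSERT INTO active_users VALUES
        (1, '2018-03-12'), (1, '2018-03-01'), (1, '2018-03-06'), (1, '2018-03-09'),
        (2, '2018-01-02'), (2, '2018-01-04'), (2, '2018-01-05'), (2, '2018-01-11');
    INSERT INTO release_dates VALUES
        (1, '2018-03-01', '2018-03-08'),
        (2, '2018-01-02', '2018-01-09');
""")

# Join each activity row to its version's release window, then count per version.
query = """
    SELECT au.Version, COUNT(*) AS active_count
    FROM active_users AS au
    JOIN release_dates AS rd ON au.Version = rd.Version
    WHERE au.Time >= rd.Release_Date
      AND au.Time < rd.Week_After_Release_Date
    GROUP BY au.Version
    ORDER BY au.Version
"""
join_counts = conn.execute(query).fetchall()
print(join_counts)  # [(1, 2), (2, 3)]
```

Because the filter sits in the WHERE clause, a version with no activity inside its window would not appear in the output at all, rather than appearing with a count of 0.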

Upvotes: 0

Gordon Linoff

Reputation: 1269443

One way is a correlated subquery:

select rd.*,
       (select count(*)
        from active_users au
        where au.version = rd.version and
              au.time >= rd.release_date and
              au.time < rd.week_after_release_date + interval 1 day
       ) as active_count
from release_dates rd;

Note that I do not use between on the dates, just in case the dates have a time component.
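
To illustrate why the half-open comparison matters, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. Several things in it are assumptions made for illustration: the rows with a time-of-day part are hypothetical, SQLite's date(x, '+1 day') replaces MySQL's `+ interval 1 day`, and the subquery is correlated on Version so that each release row counts only its own version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active_users (Version INTEGER, Time TEXT);
    CREATE TABLE release_dates (
        Version INTEGER, Release_Date TEXT, Week_After_Release_Date TEXT
    );
    -- Hypothetical rows with a time-of-day part (not from the question).
    INSERT INTO active_users VALUES
        (1, '2018-03-01 00:00:00'),
        (1, '2018-03-08 15:30:00'),
        (1, '2018-03-09 00:00:00');
    INSERT INTO release_dates VALUES (1, '2018-03-01', '2018-03-08');
""")

# BETWEEN stops at midnight of the end date, so the 15:30 row is missed.
between_count = conn.execute("""
    SELECT (SELECT COUNT(*) FROM active_users au
            WHERE au.Version = rd.Version
              AND au.Time BETWEEN rd.Release_Date AND rd.Week_After_Release_Date)
    FROM release_dates rd
""").fetchone()[0]

# Half-open range [Release_Date, end date + 1 day) covers the whole end day.
half_open_count = conn.execute("""
    SELECT (SELECT COUNT(*) FROM active_users au
            WHERE au.Version = rd.Version
              AND au.Time >= rd.Release_Date
              AND au.Time < date(rd.Week_After_Release_Date, '+1 day'))
    FROM release_dates rd
""").fetchone()[0]

print(between_count, half_open_count)  # 1 2
```

The row at 15:30 on the end date is counted only by the half-open version, while the row at midnight of the following day is excluded by both.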

Upvotes: 0

jpw

Reputation: 44871

One way to get the result you want is to combine an aggregate operation with a conditional expression, so that only the rows that match the condition are counted:

select 
  au.version, 
  sum(if(au.time between rd.release_date and rd.week_after_release_date,1,0)) as count 
from active_users au
join release_dates rd on au.version = rd.version
group by au.version

If you prefer to use the count function, the third line could be written as:

count(if(au.time between rd.release_date and rd.week_after_release_date, au.time, null)) as count

If you want the query to be slightly more portable, you could use a case expression instead of if.
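
As a sketch of the case variant, the following runs the conditional aggregation through Python's built-in sqlite3 module, which has no MySQL-style if() with these semantics and so serves as a rough portability check. SQLite as a stand-in for MySQL, the active_count alias, and the extra version 3 rows are assumptions added for illustration; version 3 is hypothetical and is there only to show what happens when no row falls inside the window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active_users (Version INTEGER, Time TEXT);
    CREATE TABLE release_dates (
        Version INTEGER, Release_Date TEXT, Week_After_Release_Date TEXT
    );
    INSERT INTO active_users VALUES
        (1, '2018-03-12'), (1, '2018-03-01'), (1, '2018-03-06'), (1, '2018-03-09'),
        (2, '2018-01-02'), (2, '2018-01-04'), (2, '2018-01-05'), (2, '2018-01-11'),
        (3, '2018-05-20');  -- hypothetical: active only outside its window
    INSERT INTO release_dates VALUES
        (1, '2018-03-01', '2018-03-08'),
        (2, '2018-01-02', '2018-01-09'),
        (3, '2018-05-01', '2018-05-08');  -- hypothetical release row
""")

# Every joined row is kept; the CASE expression decides whether it adds 1 or 0.
query = """
    SELECT au.Version,
           SUM(CASE WHEN au.Time BETWEEN rd.Release_Date
                                     AND rd.Week_After_Release_Date
                    THEN 1 ELSE 0 END) AS active_count
    FROM active_users au
    JOIN release_dates rd ON au.Version = rd.Version
    GROUP BY au.Version
    ORDER BY au.Version
"""
case_counts = conn.execute(query).fetchall()
print(case_counts)  # [(1, 2), (2, 3), (3, 0)]
```

Since the condition lives in the aggregate rather than in a WHERE clause, a version with activity only outside its window still shows up, with a count of 0.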

Sample SQL Fiddle

Upvotes: 1
