Reputation: 2187
Looking for an elegant solution to this... I find that anything I do is quite ugly... I simply want to plot, for each month, how many users are active (within their start/end dates). Naturally, I need a count of 0 where the month has no active user in that report date range...
TABLE MEMBERSHIPS
Name, Start_date, End_date
Joe, 2017/02/01, 2017/04/01
Bob, 2017/03/01, 2017/05/01
Moe, 2017/03/01, 2017/05/01
Lou, 2017/04/01, 2017/05/01
So I need a report of active members by month... that is, a count of how many members are active in a given month given a range... So expected results:
Reoort for 2017/01/01 to 2017/06/01
Month, Count
01 | 0
02 | 1
03 | 3
04 | 4
05 | 3
06 | 0
I am sure this is a common use case... just could not find anything nice...
Thanks,
Upvotes: 1
Views: 569
Reputation: 4425
The problem here is that a part of the data you need to create the report is missing. How should the database know that you want to include month 01 although there is no data with that month?
An easy way to solve this is creating a table which contains all months the report should include. To simplify things further, instead of the year and the month itself, just write an arbitrary date which lies in the respective month into that table.
Table ReportMonths:
ReportMonth datetime
2017-01-15
2017-02-15
2017-03-15
2017-14-15
2017-05-15
2017-06-15
To create the report using the new table:
SELECT
ReportMonths.ReportMonth,
COUNT(*)
FROM
ReportMonths, MEMBERSHIPS
WHERE
(MEMBERSHIPS.Start_date <= ReportMonths.ReportMonth) AND
(MEMBERSHIPS.End_date > ReportMonths.ReportMonth)
GROUP BY
ReportMonths.ReportMonth;
DISCLAIMER
1) I don't have MySQL at hands right now, so I can't test the code. Please forgive me if there are syntax errors. It should give you a general how to solve your problem, though.
2) If you have millions of rows, that solution might be slow or might make MySQL freak out since it has to build the cross product of the two tables first. If this is the case, we could construct a more sophisticated query which behaves better. Tell us if you are interested ...
DISCLAIMER END
The solution above will return the month in question as date. If you don't want that, then SELECT YEAR(ReportMonths.ReportMonth), MONTH(ReportMonths.ReportMonth), ...
instead of SELECT ReportMonths.ReportMonth, ...
.
If you don't want to use that second table, then I am afraid you will have to write a stored procedure or a back-end application which loops through every year and month and in every loop selects the count of the rows which start before and end after the current year / month, and then writes that count somewhere, possibly another table or an array, for further processing.
The advantage would be that the database would not have to build the cross product between two tables; the disadvantage would be that this is not a one-query-solution.
Upvotes: 2