Sam Naholo
Sam Naholo

Reputation: 33

MySQL monthly count of active users

Could some one please assist me creating a mysql monthly report for active users. A user visits a doctor on a particular date. that date is recorded on the visit date. they are then immediately issued an appointment date of when they should come back. A users active months are those from the month of visit all the way up to appointment date month , plus 90 days grace period.. if they dont pitch on the appointment date, they are given a grace period of 90 days, in which they'll still be considered as active users. after that they wont be considered as active anymore.

Users Table
+------------+------------+------------+
|  UserID    |  visit     |Appointment |  
+------------+------------+------------+
| 10001      | 01-01-2010 | 01-02-2010 | 
| 10001      | 05-02-2010 | 01-03-2010 | 
| 10002      | 20-07-2010 | 15-10-2010 | 
| 10003      | 01-11-2010 | 10-11-2010| 
+------------+------------+------------+

The desired results would be

Monthly Report
+------------+------------+------------+
|  Month     |  active    |            |  
+------------+------------+------------+
| 2010-01    | 1          |            | 
| 2010-02    | 1          |            | 
| 2010-03    | 1          |            | 
| 2010-04    | 1          |            | 
| 2010-05    | 1          |            | 
| 2010-07    | 1          |            |
| 2010-08    | 1          |            | 
| 2010-09    | 1          |            | 
| 2010-10    | 1          |            | 
| 2010-11    | 2          |            | 
| 2010-12    | 2          |            | 
| 2011-01    | 1          |            | 
+------------+------------+------------+

Here is the sql code

  CREATE TABLE `visits` (
   `id` int(10) NOT NULL,
   `userid` int(10) NOT NULL,
   `visit` date NOT NULL,
   `appointment` date NOT NULL
   ) ENGINE=InnoDB DEFAULT CHARSET=latin1;



INSERT INTO `visits` (`id`, `userid`, `visit`, `appointment`) VALUES
(1, 10001, '2010-01-01', '2010-02-01'),
(2, 10001, '2010-02-05', '2010-03-01'),
(3, 10002, '2010-07-20', '2010-10-15'),
(4, 10003, '2010-11-01', '2010-11-10');


ALTER TABLE `visits`
ADD PRIMARY KEY (`id`);

ALTER TABLE `visits`
  MODIFY `id` int(10) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=10007;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;

Upvotes: 2

Views: 1237

Answers (2)

Sam Naholo
Sam Naholo

Reputation: 33

For those that might need this.

SELECT
date_format(c.dt,'%Y-%m') AS "month",
c.dt AS date,
a.visit,
a.fu,
COUNT(DISTINCT userid) AS Active
FROM calendar_table c
LEFT JOIN visits v ON c.dt BETWEEN date_format(v.visit,'%Y-%m') AND 
DATE_SUB(date_add(v.appointment, INTERVAL 90 day), INTERVAL 
date_format((LAST_DAY(date_add(v.appointment, INTERVAL 90 day))),'%d')-1 DAY)
WHERE c.d = 1
AND c.y IN (2010,2011)
GROUP BY c.dt

Upvotes: 1

Paul Maxwell
Paul Maxwell

Reputation: 35573

It is typical for this type of requirement that you need some form of "tally table" and/or "calendar table". That is, for your column "month", you really need this is a table of some form. This enables you to represent the date range as a set of rows (one per month n this case).

Once you have the date range represented as rows, then use an outer join to visits using between in the join condition. this will then allow you to count how many users are active per time unit.

For the following example I used code from this article on a calendar table

CREATE TABLE calendar_table (
    dt DATE NOT NULL PRIMARY KEY,
    y SMALLINT NULL,
    q tinyint NULL,
    m tinyint NULL,
    d tinyint NULL,
    dw tinyint NULL,
    monthName VARCHAR(9) NULL,
    dayName VARCHAR(9) NULL,
    w tinyint NULL,
    isWeekday BINARY(1) NULL,
    isHoliday BINARY(1) NULL,
    holidayDescr VARCHAR(32) NULL,
    isPayday BINARY(1) NULL
);

CREATE TABLE ints ( i tinyint );

INSERT INTO ints VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

INSERT INTO calendar_table (dt)
SELECT DATE('2010-01-01') + INTERVAL a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i DAY
FROM ints a JOIN ints b JOIN ints c JOIN ints d JOIN ints e
WHERE (a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i) <= 11322
ORDER BY 1;

UPDATE calendar_table
SET isWeekday = CASE WHEN dayofweek(dt) IN (1,7) THEN 0 ELSE 1 END,
    isHoliday = 0,
    isPayday = 0,
    y = YEAR(dt),
    q = quarter(dt),
    m = MONTH(dt),
    d = dayofmonth(dt),
    dw = dayofweek(dt),
    monthname = monthname(dt),
    dayname = dayname(dt),
    w = week(dt),
    holidayDescr = '';

Now, with your sample data as per you question, using the following query:

select
      date_format(c.dt,'%Y-%m') as "month"
    , count(distinct userid) as active
from calendar_table c
left join visits v on c.dt between v.visit and date_add(v.appointment, INTERVAL 90 DAY)
where c.y in (2010,2011)
group by
      date_format(c.dt,'%Y-%m')

Note this approach will give you 24 rows (each month over 2 years) so you will get more rows with zeros than in your question's expected results. Just adjust the where clause to suit the desired date range.

See this operating as an example at sqlfiddle

Note: no attention has been given to indexing or performance in this example

Upvotes: 0

Related Questions