EDDY
EDDY

Reputation: 103

Select each month even though the month not exist in mysql table

Let say I have these two tables in mysql.

table1:

date         staff_no
2016-06-10   1
2016-06-09   1
2016-05-09   1
2016-04-09   1

table2:

staff_no    name
1           David

Then, I have this query to get analysis for the staff for each month:

SELECT DATE_FORMAT(table1.date,'%b %Y') as month,COUNT(table1.date) as total_records,table2.name as name
FROM table1 as table1 
LEFT JOIN table2 as table2 on table2.staff_no = table1.staff_no
WHERE table1.staff_no = "1" and date(table1.date) between = "2016-04-01" and "2016-06-30" 
GROUP BY table2.name,DATE_FORMAT(table1.date,'%Y-%m')
ORDER BY DATE_FORMAT(table1.date,'%Y-%m-%d')

This query will output:

month      total_records  name
Apr 2016               1  David
May 2016               1  David
Jun 2016               2  David

But, if I replace the date between "2016-04-01" and "2016-07-31" from the query,it wont show me the July record because it is not exist in table1 which is not what I want. I still want to get result like this:

month      total_records  name
Apr 2016               1  David
May 2016               1  David
Jun 2016               2  David
Jul 2016               0  David   

Anyone expert on this? Kindly help me into this. Thanks!

Upvotes: 1

Views: 1802

Answers (3)

DRapp
DRapp

Reputation: 48179

You can build an inline set of variables representing all the dates you want by using any other table in your system that has AT LEAST the number of months you are trying to represent even though the data does not have to have dates. Just has records that you can put a limit on.

TRY the following statement that uses MySql variables. The FROM clause declares a variable inline to the SQL statement "@Date1". I am starting it with MARCH 1 of 2016. Now, the select fields list takes that variable and keeps adding 1 month at a time to it. Since it is combined with the "AnyTableWithAtLeast12Records" (literally any table in your system with at least X records), it will create a result showing the dates. This is one way of forcing a calendar type of list.

But notice the SECOND column in this select does not change the @Date1 via the := assignment. So, it takes the date as it now stands and adds another month to it for the END Date. If you need a smaller or larger date range, just change the limit of records to create the calendar spread...

select
     @Date1 := date_add( @Date1, interval 1 month ) StartDate,
      date_add( @Date1, interval 1 month ) EndDate
    from
      AnyTableWithAtLeast12Records,
      ( select @Date1 := '2016-03-01' ) sqlvars
   limit 12;

The result is something like...

StartDate   EndDate
2016-04-01  2016-05-01
2016-05-01  2016-06-01
2016-06-01  2016-07-01
2016-07-01  2016-08-01
2016-08-01  2016-09-01
2016-09-01  2016-10-01
2016-10-01  2016-11-01
2016-11-01  2016-12-01
2016-12-01  2017-01-01
2017-01-01  2017-02-01
2017-02-01  2017-03-01
2017-03-01  2017-04-01

Now you have your dynamic "Calendar" completed in one simple query. Now, use that as a basis for all the records you need counts for and format as you had. So take the entire query above as a JOIN to find records within those date ranges... No other queries or stored procedures required. Now, a simple LEFT JOIN will keep all dates, but only show those with staff when WITHIN the between range of per start/end. So ex: greater or equal to 04/01/2016, but LESS THEN 05/01/2016 which includes 04/30/2016 @ 11:59:59pm.

SELECT 
      DATE_FORMAT(MyCalendar.StartDate,'%b %Y') as month,
      COALESCE(COUNT(T1.Staff_no),0) as total_records,
      COALESCE(T2.name,"") as name
   FROM 
      ( select @Date1 := date_add( @Date1, interval 1 month ) StartDate,
               date_add( @Date1, interval 1 month ) EndDate
           from
              AnyTableWithAtLeast12Records,
              ( select @Date1 := '2016-03-01' ) sqlvars
           limit 12 ) MyCalendar
        LEFT JOIN table1 T1
           ON T1.Date >= MyCalendar.StartDate
           AND T1.Date < MyCalendar.EndDate
           AND T1.Staff_No = 1
           LEFT JOIN table2 T2
              ON T1.staff_no = T2.StaffNo
   GROUP BY
      T2.name,
      DATE_FORMAT(MyCalendar.StartDate,'%Y-%m')
   ORDER BY 
      DATE_FORMAT(MyCalendar.StartDate,'%Y-%m-%d')

Upvotes: 1

Drew
Drew

Reputation: 24960

Consider the following schema with the 3rd table being the year/month Helper Table mentioned. Helper tables are very common and can be re-used throughout your code naturally. I will leave it to you to load it up with substantial date data. Note however the way the end date for each month was put together for those of us that want to do less work, while allowing the db engine to figure out leap years for us.

You could have just one column in that helper table. But that would require the use of function calls for end dates in some of your functions and that means more slowness. We like fast.

Schema

create table workerRecords
(   id int auto_increment primary key,
    the_date date not null,
    staff_no int not null
);
-- truncate workerRecords;
insert workerRecords(the_date,staff_no) values
('2016-06-10',1),
('2016-06-09',1),
('2016-05-09',1),
('2016-04-09',1),
('2016-03-02',2),
('2016-07-02',2);

create table workers
(   staff_no int primary key,
    full_name varchar(100) not null
);
-- truncate workers;
insert workers(staff_no,full_name) values
(1,'David Higgins'),(2,"Sally O'Riordan");

Helper table below

create table ymHelper
(   -- Year Month helper table. Used for left joins to pick up all dates.
    -- PK is programmer's choice.
    dtBegin date primary key,   -- by definition not null
    dtEnd date null
);
-- truncate ymHelper;
insert ymHelper (dtBegin,dtEnd) values
('2015-01-01',null),('2015-02-01',null),('2015-03-01',null),('2015-04-01',null),('2015-05-01',null),('2015-06-01',null),('2015-07-01',null),('2015-08-01',null),('2015-09-01',null),('2015-10-01',null),('2015-11-01',null),('2015-12-01',null),
('2016-01-01',null),('2016-02-01',null),('2016-03-01',null),('2016-04-01',null),('2016-05-01',null),('2016-06-01',null),('2016-07-01',null),('2016-08-01',null),('2016-09-01',null),('2016-10-01',null),('2016-11-01',null),('2016-12-01',null),
('2017-01-01',null),('2017-02-01',null),('2017-03-01',null),('2017-04-01',null),('2017-05-01',null),('2017-06-01',null),('2017-07-01',null),('2017-08-01',null),('2017-09-01',null),('2017-10-01',null),('2017-11-01',null),('2017-12-01',null),
('2018-01-01',null),('2018-02-01',null),('2018-03-01',null),('2018-04-01',null),('2018-05-01',null),('2018-06-01',null),('2018-07-01',null),('2018-08-01',null),('2018-09-01',null),('2018-10-01',null),('2018-11-01',null),('2018-12-01',null),
('2019-01-01',null),('2019-02-01',null),('2019-03-01',null),('2019-04-01',null),('2019-05-01',null),('2019-06-01',null),('2019-07-01',null),('2019-08-01',null),('2019-09-01',null),('2019-10-01',null),('2019-11-01',null),('2019-12-01',null);
-- will leave as an exercise for you to add more years. Good idea to start, 10 in either direction, at least.
update ymHelper set dtEnd=LAST_DAY(dtBegin);    -- data patch. Confirmed leap years.
alter table ymHelper modify dtEnd date not null;    -- there, ugly patch above worked fine. Can forget it ever happened (until you add rows)
-- show create table ymHelper; -- this confirms that dtEnd is not null

So that is a helper table. Set it up once and forget about it for a few years

Note: Don't forget to run the above update stmt

Quick Test for your Query

SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,
ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name 
FROM ymHelper ymH 
left join workerRecords wr 
on wr.the_date between ymH.dtBegin and ymH.dtEnd 
and wr.staff_no = 1 and wr.the_date between '2016-04-01' and '2016-07-31' 
LEFT JOIN workers w on w.staff_no = wr.staff_no 
cross join (select @soloName:=full_name from workers where staff_no=1) xDerived 
WHERE ymH.dtBegin between '2016-04-01' and '2016-07-31' 
GROUP BY ymH.dtBegin 
order by ymH.dtBegin; 

+----------+---------------+---------------+
| month    | total_records | full_name     |
+----------+---------------+---------------+
| Apr 2016 |             1 | David Higgins |
| May 2016 |             1 | David Higgins |
| Jun 2016 |             2 | David Higgins |
| Jul 2016 |             0 | David Higgins |
+----------+---------------+---------------+

It works fine. The first mysql table is the Helper table. A left join to bring in the worker records (allowing for null). Let's pause here. That was afterall the point of your question: missing data. Finally the worker table in a cross join.

The cross join is to initialize a variable (@soloName) that is the worker's name. Whereas the null status of missing dates as you requested is picked up fine via the ifnull() function returning 0, we don't have that luxury for a worker's name. That forces the cross join.

A cross join is a cartesian product. But since it is a single row, we don't suffer from the normal problems one gets with cartesians causing way to many rows in the result set. Anyway, it works.

But here is one problem: it is too hard to maintain and plug in values in 6 places as can be seen. So consider below a stored proc for it.

Stored Proc

drop procedure if exists getOneWorkersRecCount;
DELIMITER $$
create procedure getOneWorkersRecCount
(pStaffNo int, pBeginDt date, pEndDt  date)
BEGIN
    SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name
    FROM ymHelper ymH 
    left join workerRecords wr 
    on wr.the_date between ymH.dtBegin and ymH.dtEnd 
    and wr.staff_no = pStaffNo and wr.the_date between pBeginDt and pEndDt
    LEFT JOIN workers w on w.staff_no = wr.staff_no 
    cross join (select @soloName:=full_name from workers where staff_no=pStaffNo) xDerived
    WHERE ymH.dtBegin between pBeginDt and pEndDt 
    GROUP BY ymH.dtBegin
    order by ymH.dtBegin;
END$$
DELIMITER ;

Test the stored proc a number of times

call getOneWorkersRecCount(1,'2016-04-01','2016-06-09');
call getOneWorkersRecCount(1,'2016-04-01','2016-06-10');
call getOneWorkersRecCount(1,'2016-04-01','2016-07-01');
call getOneWorkersRecCount(2,'2016-02-01','2016-11-01');

Ah, much easier to work with (in PHP, c#, Java, you name it). Choice is yours, stored proc or not.

Bonus Stored Proc

drop procedure if exists getAllWorkersRecCount;
DELIMITER $$
create procedure getAllWorkersRecCount
(pBeginDt date, pEndDt  date)
BEGIN
    SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,w.staff_no,w.full_name
    FROM ymHelper ymH 
    cross join workers w 
    left join workerRecords wr 
    on wr.the_date between ymH.dtBegin and ymH.dtEnd 
    and wr.staff_no = w.staff_no and wr.the_date between pBeginDt and pEndDt
    -- LEFT JOIN workers w on w.staff_no = wr.staff_no 
    -- cross join (select @soloName:=full_name from workers ) xDerived
    WHERE ymH.dtBegin between pBeginDt and pEndDt 
    GROUP BY ymH.dtBegin,w.staff_no,w.full_name
    order by ymH.dtBegin,w.staff_no;
END$$
DELIMITER ;

Quick test of it

call getAllWorkersRecCount('2016-03-01','2016-08-01');
+----------+---------------+----------+-----------------+
| month    | total_records | staff_no | full_name       |
+----------+---------------+----------+-----------------+
| Mar 2016 |             0 |        1 | David Higgins   |
| Mar 2016 |             1 |        2 | Sally O'Riordan |
| Apr 2016 |             1 |        1 | David Higgins   |
| Apr 2016 |             0 |        2 | Sally O'Riordan |
| May 2016 |             1 |        1 | David Higgins   |
| May 2016 |             0 |        2 | Sally O'Riordan |
| Jun 2016 |             2 |        1 | David Higgins   |
| Jun 2016 |             0 |        2 | Sally O'Riordan |
| Jul 2016 |             0 |        1 | David Higgins   |
| Jul 2016 |             1 |        2 | Sally O'Riordan |
| Aug 2016 |             0 |        1 | David Higgins   |
| Aug 2016 |             0 |        2 | Sally O'Riordan |
+----------+---------------+----------+-----------------+

The Takeaway

Helper Tables have been used for decades. Don't be afraid or embarrassed to use them. In fact, trying to get some specialty work done without them is nearly impossible at times.

Upvotes: 1

Stan
Stan

Reputation: 1430

I would say you need to have RIGHT JOIN here to include staff from second table

Upvotes: 0

Related Questions