Reputation: 103
Let say I have these two tables in mysql.
table1:
date staff_no
2016-06-10 1
2016-06-09 1
2016-05-09 1
2016-04-09 1
table2:
staff_no name
1 David
Then, I have this query to get analysis for the staff for each month:
SELECT DATE_FORMAT(table1.date,'%b %Y') as month,COUNT(table1.date) as total_records,table2.name as name
FROM table1 as table1
LEFT JOIN table2 as table2 on table2.staff_no = table1.staff_no
WHERE table1.staff_no = "1" and date(table1.date) between = "2016-04-01" and "2016-06-30"
GROUP BY table2.name,DATE_FORMAT(table1.date,'%Y-%m')
ORDER BY DATE_FORMAT(table1.date,'%Y-%m-%d')
This query will output:
month total_records name
Apr 2016 1 David
May 2016 1 David
Jun 2016 2 David
But, if I replace the date between "2016-04-01" and "2016-07-31" from the query,it wont show me the July record because it is not exist in table1 which is not what I want. I still want to get result like this:
month total_records name
Apr 2016 1 David
May 2016 1 David
Jun 2016 2 David
Jul 2016 0 David
Anyone expert on this? Kindly help me into this. Thanks!
Upvotes: 1
Views: 1802
Reputation: 48179
You can build an inline set of variables representing all the dates you want by using any other table in your system that has AT LEAST the number of months you are trying to represent even though the data does not have to have dates. Just has records that you can put a limit on.
TRY the following statement that uses MySql variables. The FROM clause declares a variable inline to the SQL statement "@Date1". I am starting it with MARCH 1 of 2016. Now, the select fields list takes that variable and keeps adding 1 month at a time to it. Since it is combined with the "AnyTableWithAtLeast12Records" (literally any table in your system with at least X records), it will create a result showing the dates. This is one way of forcing a calendar type of list.
But notice the SECOND column in this select does not change the @Date1 via the := assignment. So, it takes the date as it now stands and adds another month to it for the END Date. If you need a smaller or larger date range, just change the limit of records to create the calendar spread...
select
@Date1 := date_add( @Date1, interval 1 month ) StartDate,
date_add( @Date1, interval 1 month ) EndDate
from
AnyTableWithAtLeast12Records,
( select @Date1 := '2016-03-01' ) sqlvars
limit 12;
The result is something like...
StartDate EndDate
2016-04-01 2016-05-01
2016-05-01 2016-06-01
2016-06-01 2016-07-01
2016-07-01 2016-08-01
2016-08-01 2016-09-01
2016-09-01 2016-10-01
2016-10-01 2016-11-01
2016-11-01 2016-12-01
2016-12-01 2017-01-01
2017-01-01 2017-02-01
2017-02-01 2017-03-01
2017-03-01 2017-04-01
Now you have your dynamic "Calendar" completed in one simple query. Now, use that as a basis for all the records you need counts for and format as you had. So take the entire query above as a JOIN to find records within those date ranges... No other queries or stored procedures required. Now, a simple LEFT JOIN will keep all dates, but only show those with staff when WITHIN the between range of per start/end. So ex: greater or equal to 04/01/2016, but LESS THEN 05/01/2016 which includes 04/30/2016 @ 11:59:59pm.
SELECT
DATE_FORMAT(MyCalendar.StartDate,'%b %Y') as month,
COALESCE(COUNT(T1.Staff_no),0) as total_records,
COALESCE(T2.name,"") as name
FROM
( select @Date1 := date_add( @Date1, interval 1 month ) StartDate,
date_add( @Date1, interval 1 month ) EndDate
from
AnyTableWithAtLeast12Records,
( select @Date1 := '2016-03-01' ) sqlvars
limit 12 ) MyCalendar
LEFT JOIN table1 T1
ON T1.Date >= MyCalendar.StartDate
AND T1.Date < MyCalendar.EndDate
AND T1.Staff_No = 1
LEFT JOIN table2 T2
ON T1.staff_no = T2.StaffNo
GROUP BY
T2.name,
DATE_FORMAT(MyCalendar.StartDate,'%Y-%m')
ORDER BY
DATE_FORMAT(MyCalendar.StartDate,'%Y-%m-%d')
Upvotes: 1
Reputation: 24960
Consider the following schema with the 3rd table being the year/month Helper Table mentioned. Helper tables are very common and can be re-used throughout your code naturally. I will leave it to you to load it up with substantial date data. Note however the way the end date for each month was put together for those of us that want to do less work, while allowing the db engine to figure out leap years for us.
You could have just one column in that helper table. But that would require the use of function calls for end dates in some of your functions and that means more slowness. We like fast.
create table workerRecords
( id int auto_increment primary key,
the_date date not null,
staff_no int not null
);
-- truncate workerRecords;
insert workerRecords(the_date,staff_no) values
('2016-06-10',1),
('2016-06-09',1),
('2016-05-09',1),
('2016-04-09',1),
('2016-03-02',2),
('2016-07-02',2);
create table workers
( staff_no int primary key,
full_name varchar(100) not null
);
-- truncate workers;
insert workers(staff_no,full_name) values
(1,'David Higgins'),(2,"Sally O'Riordan");
create table ymHelper
( -- Year Month helper table. Used for left joins to pick up all dates.
-- PK is programmer's choice.
dtBegin date primary key, -- by definition not null
dtEnd date null
);
-- truncate ymHelper;
insert ymHelper (dtBegin,dtEnd) values
('2015-01-01',null),('2015-02-01',null),('2015-03-01',null),('2015-04-01',null),('2015-05-01',null),('2015-06-01',null),('2015-07-01',null),('2015-08-01',null),('2015-09-01',null),('2015-10-01',null),('2015-11-01',null),('2015-12-01',null),
('2016-01-01',null),('2016-02-01',null),('2016-03-01',null),('2016-04-01',null),('2016-05-01',null),('2016-06-01',null),('2016-07-01',null),('2016-08-01',null),('2016-09-01',null),('2016-10-01',null),('2016-11-01',null),('2016-12-01',null),
('2017-01-01',null),('2017-02-01',null),('2017-03-01',null),('2017-04-01',null),('2017-05-01',null),('2017-06-01',null),('2017-07-01',null),('2017-08-01',null),('2017-09-01',null),('2017-10-01',null),('2017-11-01',null),('2017-12-01',null),
('2018-01-01',null),('2018-02-01',null),('2018-03-01',null),('2018-04-01',null),('2018-05-01',null),('2018-06-01',null),('2018-07-01',null),('2018-08-01',null),('2018-09-01',null),('2018-10-01',null),('2018-11-01',null),('2018-12-01',null),
('2019-01-01',null),('2019-02-01',null),('2019-03-01',null),('2019-04-01',null),('2019-05-01',null),('2019-06-01',null),('2019-07-01',null),('2019-08-01',null),('2019-09-01',null),('2019-10-01',null),('2019-11-01',null),('2019-12-01',null);
-- will leave as an exercise for you to add more years. Good idea to start, 10 in either direction, at least.
update ymHelper set dtEnd=LAST_DAY(dtBegin); -- data patch. Confirmed leap years.
alter table ymHelper modify dtEnd date not null; -- there, ugly patch above worked fine. Can forget it ever happened (until you add rows)
-- show create table ymHelper; -- this confirms that dtEnd is not null
So that is a helper table. Set it up once and forget about it for a few years
Note: Don't forget to run the above update stmt
SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,
ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name
FROM ymHelper ymH
left join workerRecords wr
on wr.the_date between ymH.dtBegin and ymH.dtEnd
and wr.staff_no = 1 and wr.the_date between '2016-04-01' and '2016-07-31'
LEFT JOIN workers w on w.staff_no = wr.staff_no
cross join (select @soloName:=full_name from workers where staff_no=1) xDerived
WHERE ymH.dtBegin between '2016-04-01' and '2016-07-31'
GROUP BY ymH.dtBegin
order by ymH.dtBegin;
+----------+---------------+---------------+
| month | total_records | full_name |
+----------+---------------+---------------+
| Apr 2016 | 1 | David Higgins |
| May 2016 | 1 | David Higgins |
| Jun 2016 | 2 | David Higgins |
| Jul 2016 | 0 | David Higgins |
+----------+---------------+---------------+
It works fine. The first mysql table is the Helper table. A left join to bring in the worker records (allowing for null). Let's pause here. That was afterall the point of your question: missing data. Finally the worker table in a cross join.
The cross join
is to initialize a variable (@soloName
) that is the worker's name. Whereas the null status of missing dates as you requested is picked up fine via the ifnull()
function returning 0, we don't have that luxury for a worker's name. That forces the cross join
.
A cross join is a cartesian product. But since it is a single row, we don't suffer from the normal problems one gets with cartesians causing way to many rows in the result set. Anyway, it works.
But here is one problem: it is too hard to maintain and plug in values in 6 places as can be seen. So consider below a stored proc for it.
drop procedure if exists getOneWorkersRecCount;
DELIMITER $$
create procedure getOneWorkersRecCount
(pStaffNo int, pBeginDt date, pEndDt date)
BEGIN
SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name
FROM ymHelper ymH
left join workerRecords wr
on wr.the_date between ymH.dtBegin and ymH.dtEnd
and wr.staff_no = pStaffNo and wr.the_date between pBeginDt and pEndDt
LEFT JOIN workers w on w.staff_no = wr.staff_no
cross join (select @soloName:=full_name from workers where staff_no=pStaffNo) xDerived
WHERE ymH.dtBegin between pBeginDt and pEndDt
GROUP BY ymH.dtBegin
order by ymH.dtBegin;
END$$
DELIMITER ;
call getOneWorkersRecCount(1,'2016-04-01','2016-06-09');
call getOneWorkersRecCount(1,'2016-04-01','2016-06-10');
call getOneWorkersRecCount(1,'2016-04-01','2016-07-01');
call getOneWorkersRecCount(2,'2016-02-01','2016-11-01');
Ah, much easier to work with (in PHP, c#, Java, you name it). Choice is yours, stored proc or not.
drop procedure if exists getAllWorkersRecCount;
DELIMITER $$
create procedure getAllWorkersRecCount
(pBeginDt date, pEndDt date)
BEGIN
SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,w.staff_no,w.full_name
FROM ymHelper ymH
cross join workers w
left join workerRecords wr
on wr.the_date between ymH.dtBegin and ymH.dtEnd
and wr.staff_no = w.staff_no and wr.the_date between pBeginDt and pEndDt
-- LEFT JOIN workers w on w.staff_no = wr.staff_no
-- cross join (select @soloName:=full_name from workers ) xDerived
WHERE ymH.dtBegin between pBeginDt and pEndDt
GROUP BY ymH.dtBegin,w.staff_no,w.full_name
order by ymH.dtBegin,w.staff_no;
END$$
DELIMITER ;
call getAllWorkersRecCount('2016-03-01','2016-08-01');
+----------+---------------+----------+-----------------+
| month | total_records | staff_no | full_name |
+----------+---------------+----------+-----------------+
| Mar 2016 | 0 | 1 | David Higgins |
| Mar 2016 | 1 | 2 | Sally O'Riordan |
| Apr 2016 | 1 | 1 | David Higgins |
| Apr 2016 | 0 | 2 | Sally O'Riordan |
| May 2016 | 1 | 1 | David Higgins |
| May 2016 | 0 | 2 | Sally O'Riordan |
| Jun 2016 | 2 | 1 | David Higgins |
| Jun 2016 | 0 | 2 | Sally O'Riordan |
| Jul 2016 | 0 | 1 | David Higgins |
| Jul 2016 | 1 | 2 | Sally O'Riordan |
| Aug 2016 | 0 | 1 | David Higgins |
| Aug 2016 | 0 | 2 | Sally O'Riordan |
+----------+---------------+----------+-----------------+
Helper Tables have been used for decades. Don't be afraid or embarrassed to use them. In fact, trying to get some specialty work done without them is nearly impossible at times.
Upvotes: 1
Reputation: 1430
I would say you need to have RIGHT JOIN here to include staff from second table
Upvotes: 0