HerrimanCoder
HerrimanCoder

Reputation: 7218

Poor query performance in MySQL join with Group By

I have a MySQL query that consistently takes 7 seconds to complete:

SELECT g.GroupID, g.GroupName, count(s.EventDate) AS Total 
FROM Groups g 
LEFT JOIN Schedule s 
ON g.GroupID = s.GroupID 
WHERE g.OrganizationID = 2 
AND g.IsActive = 1 
AND IFNULL(g.IsDeleted,0) = 0 
AND IFNULL(g.IsHidden,0) = 0 
AND (s.EventDate > DATE_ADD(NOW(), INTERVAL 0 HOUR) OR g.CreateDate > DATE_ADD(DATE_ADD(NOW(), INTERVAL -1 DAY), INTERVAL 0 HOUR)) 
GROUP BY g.GroupID, g.GroupName 
ORDER BY s.EventDate, g.GroupName;

I already have the following indexes:

ALTER TABLE Groups ADD INDEX (OrganizationID);
ALTER TABLE Groups ADD INDEX (IsActive);
ALTER TABLE Groups ADD INDEX (CreateDate);
ALTER TABLE Groups ADD INDEX (IsDeleted);
ALTER TABLE Groups ADD INDEX (IsHidden);
ALTER TABLE Groups ADD INDEX (GroupName);
ALTER TABLE Schedule ADD INDEX (EventDate);

Before the indexes were added the query took 18 seconds, so it's a huge improvement already. But I cannot understand why this query is still taking so long. The largest table in the join is Schedule, which has 13,216 records--not a huge table! The other tables are much smaller.

EXPLAIN results: enter image description here

What can I do to improve my query performance?

Upvotes: 0

Views: 47

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269713

This is query:

SELECT g.GroupID, g.GroupName, count(s.EventDate) AS Total 
FROM Groups g LEFT JOIN
     Schedule s 
     ON g.GroupID = s.GroupID 
WHERE g.OrganizationID = 2 AND
      g.IsActive = 1 AND
      IFNULL(g.IsDeleted,0) = 0 AND
      IFNULL(g.IsHidden,0) = 0 AND
      (s.EventDate > DATE_ADD(NOW(), INTERVAL 0 HOUR) OR g.CreateDate > DATE_ADD(DATE_ADD(NOW(), INTERVAL -1 DAY), INTERVAL 0 HOUR)) 
GROUP BY g.GroupID, g.GroupName 
ORDER BY s.EventDate, g.GroupName;

The best index for this query is a composite index on groups: groups(OrganizationId, IsActive, IsDeleted, IsHidden, CreateDate, GroupId). The last three columns are for covering the where clause. And then an index on schedule(GroupId, EventDate). The lack of index on schedule(GroupId) is probably the root of your performance problems.

Next, you are ordering by s.EventDate. This doesn't make sense. You are not selecting s.eventDate and it is ill defined when there is more than one value.

I don't understand what the double date_add() is supposed to be accomplishing. Adding interval 0 hour doesn't do anything. These operations do not affect performance, but they look awkward.

Upvotes: 2

Related Questions