Reputation: 4779
I've been messing around all day trying to find why my query performance is terrible. It is extremely simple, yet can take over 15 minutes to execute (I abort the query at that stage). I am joining a table with over 2 million records.
This is the select:
SELECT
audit.MessageID, alerts.AlertCount
FROM
audit
LEFT JOIN (
SELECT MessageID, COUNT(ID) AS 'AlertCount'
FROM alerts
GROUP BY MessageID
) AS alerts ON alerts.MessageID = audit.MessageID
This is the EXPLAIN
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | PRIMARY | AL | index | NULL | IDX_audit_MessageID | 4 | NULL | 2330944 | 100.00 | Using index |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 124140 | 100.00 | |
| 2 | DERIVED | alerts | index | NULL | IDX_alerts_MessageID | 5 | NULL | 124675 | 100.00 | Using index |
This is the schema:
# Not joining, just showing types
CREATE TABLE messages (
ID int NOT NULL AUTO_INCREMENT,
MessageID varchar(255) NOT NULL,
PRIMARY KEY (ID),
INDEX IDX_messages_MessageID (MessageID)
);
# 2,324,931 records
CREATE TABLE audit (
ID int NOT NULL AUTO_INCREMENT,
MessageID int NOT NULL,
LogTimestamp timestamp NOT NULL,
PRIMARY KEY (ID),
INDEX IDX_audit_MessageID (MessageID),
CONSTRAINT FK_audit_MessageID FOREIGN KEY(MessageID) REFERENCES messages(ID)
);
# 124,140
CREATE TABLE alerts (
ID int NOT NULL AUTO_INCREMENT,
AlertLevel int NOT NULL,
Text nvarchar(4096) DEFAULT NULL,
MessageID int DEFAULT 0,
PRIMARY KEY (ID),
INDEX IDX_alert_MessageID (MessageID),
CONSTRAINT FK_alert_MessageID FOREIGN KEY(MessageID) REFERENCES messages(ID)
);
A few very important things to note - the MessageID is not 1:1 in either 'audit' or 'alerts'; The MessageID can exist in one table, but not the other, or may exist in both (which is the purpose of my join); In my test DB, none of the MessageID exist in both. In other words, my query will return 2.3 million records with 0 as the count.
Another thing to note is that the 'audit' and 'alert' tables used to use MessageID as varchar(255). I created the 'messages' table expecting that it would fix the join. It actually made it worse. Previously, it would take 78 seconds, now, it never returns.
What am I missing about MySQL?
Upvotes: 1
Views: 136
Reputation: 98459
You're joining to a subquery.
The subquery results are effectively a temporary table - note the <derived2>
in the query execution plan. As you can see there, they're not indexed, since they're ephemeral.
You should execute the query as a single unit with a join, rather than joining to the results of a second query.
EDIT: Andrew has posted an answer with one example of how to do your work in a normal join query, instead of in two steps.
Upvotes: 1
Reputation: 4624
Subqueries are very hard for the MySQL engine to optimize. Try:
SELECT
audit.MessageID, COUNT(alerts.ID) AS AlertCount
FROM
audit
LEFT JOIN alerts ON alerts.MessageID = audit.MessageID
GROUP BY audit.MessageID
Upvotes: 1