Blazes
Blazes

Reputation: 4779

MySQL JOIN extremely poor performance

I've been messing around all day trying to find why my query performance is terrible. It is extremely simple, yet can take over 15 minutes to execute (I abort the query at that stage). I am joining a table with over 2 million records.

This is the select:

SELECT
    audit.MessageID, alerts.AlertCount
FROM
    audit
LEFT JOIN (
        SELECT MessageID, COUNT(ID) AS 'AlertCount'
        FROM alerts
        GROUP BY MessageID
    ) AS alerts ON alerts.MessageID = audit.MessageID

This is the EXPLAIN

| id | select_type | table      | type  | possible_keys | key                  | key_len | ref  | rows    | filtered | Extra       |
|  1 | PRIMARY     | AL         | index | NULL          | IDX_audit_MessageID  | 4       | NULL | 2330944 |   100.00 | Using index |
|  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL                 | NULL    | NULL |  124140 |   100.00 |             |
|  2 | DERIVED     | alerts     | index | NULL          | IDX_alerts_MessageID | 5       | NULL |  124675 |   100.00 | Using index |

This is the schema:

# Not joining, just showing types
CREATE TABLE messages (
    ID                  int NOT NULL AUTO_INCREMENT,
    MessageID           varchar(255) NOT NULL,
    PRIMARY KEY (ID),
    INDEX IDX_messages_MessageID (MessageID)
);

# 2,324,931 records
CREATE TABLE audit (
    ID                  int NOT NULL AUTO_INCREMENT,
    MessageID           int NOT NULL,
    LogTimestamp        timestamp NOT NULL,
    PRIMARY KEY (ID),
    INDEX IDX_audit_MessageID (MessageID),
    CONSTRAINT FK_audit_MessageID FOREIGN KEY(MessageID) REFERENCES messages(ID)
);

# 124,140
CREATE TABLE alerts (
    ID                  int NOT NULL AUTO_INCREMENT,
    AlertLevel          int NOT NULL,
    Text                nvarchar(4096) DEFAULT NULL,
    MessageID           int DEFAULT 0,
    PRIMARY KEY (ID),
    INDEX IDX_alert_MessageID (MessageID),
    CONSTRAINT FK_alert_MessageID FOREIGN KEY(MessageID) REFERENCES messages(ID)
);

A few very important things to note - the MessageID is not 1:1 in either 'audit' or 'alerts'; The MessageID can exist in one table, but not the other, or may exist in both (which is the purpose of my join); In my test DB, none of the MessageID exist in both. In other words, my query will return 2.3 million records with 0 as the count.

Another thing to note is that the 'audit' and 'alert' tables used to use MessageID as varchar(255). I created the 'messages' table expecting that it would fix the join. It actually made it worse. Previously, it would take 78 seconds, now, it never returns.

What am I missing about MySQL?

Upvotes: 1

Views: 136

Answers (2)

Borealid
Borealid

Reputation: 98459

You're joining to a subquery.

The subquery results are effectively a temporary table - note the <derived2> in the query execution plan. As you can see there, they're not indexed, since they're ephemeral.

You should execute the query as a single unit with a join, rather than joining to the results of a second query.

EDIT: Andrew has posted an answer with one example of how to do your work in a normal join query, instead of in two steps.

Upvotes: 1

Andrew
Andrew

Reputation: 4624

Subqueries are very hard for the MySQL engine to optimize. Try:

SELECT
    audit.MessageID, COUNT(alerts.ID) AS AlertCount
FROM
    audit
LEFT JOIN alerts ON alerts.MessageID = audit.MessageID
GROUP BY audit.MessageID

Upvotes: 1

Related Questions