Parul
Parul

Reputation: 7

Need to tune a mysql query

I have a query which is taking more than a day to complete with 1700128 records in the temp_message_split table, so please help in getting this tunned

Create table statement and explain plan has been provided below.

UPDATE TEMP_MESSAGE_SPLIT t1 , TEMP_MESSAGE_SPLIT t2
    SET t1.STATUS = 'D'
    WHERE
    (t1.temp_message_split_key < t2.temp_message_split_key AND t1.DH_MEMBER_ID = t2.DH_MEMBER_ID)
    AND  nullif(t1.dh_member_id,'') IS NOT NULL;

Here is create table DDL

CREATE TABLE
temp_message_split
(
    FIRST_NAME VARCHAR(20),
    LAST_NAME VARCHAR(30),
    DOB VARCHAR(10),
    EMPLOYEE_ID VARCHAR(20),
    CES_CUST_NUM VARCHAR(7),
    MED_POLICY_NUM VARCHAR(20),
    EMAIL_ADDR VARCHAR(50),
    DH_MEMBER_ID VARCHAR(9),
    ALT_ID VARCHAR(20),
    DRSN VARCHAR(2),
    SSN VARCHAR(9),
    EPIPHANY_MEMBER_ID VARCHAR(18),
    PORTAL_ADDRESS VARCHAR(30),
    STATEMENT_VENDOR VARCHAR(20),
    CONTENT_KEY VARCHAR(18),
    EPIPHANY_COMMUNICATION_ID VARCHAR(200),
    PRIORITY VARCHAR(4),
    DAYS_UNTIL_EXPIRED VARCHAR(4),
    CONTENT_DTL_KEY VARCHAR(18),
    STATUS VARCHAR(1),
    ACTIVATION_MEMBER_KEY bigint,
    MESSAGE_BOARD_KEY bigint,
    PORTAL_STATEMENT_LOC_KEY bigint,
    temp_message_split_KEY bigint NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (temp_message_split_KEY),
    INDEX EPIPHANY_COMMUNICATION_ID_IDX (EPIPHANY_COMMUNICATION_ID),
    INDEX TEMP_MESSAGE_SPLIT_IDX2 (ALT_ID),
    INDEX TEMP_MESSAGE_SPLIT_IDX3 (DRSN),
    INDEX TEMP_MESSAGE_SPLIT_IDX4 (ALT_ID, DRSN),
    INDEX TEMP_MESSAGE_SPLIT_IDX1 (DH_MEMBER_ID)
)
ENGINE=InnoDB DEFAULT CHARSET=utf8;

Here is its explain plan:

+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys                   | key                     | key_len | ref  | rows    | filtered | Extra                                          |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
|  1 | SIMPLE      | t2    | NULL       | index | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | TEMP_MESSAGE_SPLIT_IDX1 | 30      | NULL | 1619639 |   100.00 | Using index                                    |
|  1 | UPDATE      | t1    | NULL       | ALL   | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | NULL                    | NULL    | NULL | 1619639 |    33.33 | Range checked for each record (index map: 0x5) |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
2 rows in set (0.00 sec)

This query is taking more than a day for processing 1700128 in the temp_message_split table and we need to tune it in such a way that it will take as much min. time as possible.

Upvotes: 0

Views: 65

Answers (2)

Parul
Parul

Reputation: 7

Thank you Gordan for your last comment. I have updated the query based on this and now it is getting completed within seconds. Here is my new query:

UPDATE TEMP_MESSAGE_SPLIT t1 , TEMP_MESSAGE_SPLIT t2
    SET t1.STATUS = 'D'
    WHERE
    (t1.temp_message_split_key < t2.temp_message_split_key AND t1.DH_MEMBER_ID = t2.DH_MEMBER_ID)
    AND t1.DH_MEMBER_ID <> ''

Would you be able to help me understand how such minor change in query caused such a huge difference in query performance.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269683

My best guess is that you want to set the status to D for all but the highest value of temp_message_split_key for each DH_MEMBER_ID.

The best solution is NOT EXISTS, but MySQL doesn't support NOT EXISTS on the same table in an UPDATE query.

So, another method uses GROUP BY:

UPDATE TEMP_MESSAGE_SPLIT t1 JOIN
       (SELECT t2.DH_MEMBER_ID, MAX(t2.temp_message_split_key) as max_temp_message_split_key
        FROM TEMP_MESSAGE_SPLIT t2
        GROUP BY t2.DH_MEMBER_ID
       ) t2
       ON t1.DH_MEMBER_ID = t2.DH_MEMBER_ID AND
          t1.temp_message_split_key < t2.max_temp_message_split_key
    SET t1.STATUS = 'D';

An index on (dh_member_id, temp_message_split_key) may help performance.

This will still take a long time, because you are (presumably) updating a lot of rows. If possible, then it is probably simpler just to create a new table with the values you want. That will be much faster (due to logging and locking).

The NULLIF() is probably doing nothing, but it has a minimal affect on the performance of your query. It would be better written as t1.dh_member_id <> ''.

Upvotes: 1

Related Questions