Reputation: 7
I have a query which is taking more than a day to complete with 1700128 records in the temp_message_split table, so please help in getting this tunned
Create table statement and explain plan has been provided below.
UPDATE TEMP_MESSAGE_SPLIT t1 , TEMP_MESSAGE_SPLIT t2
SET t1.STATUS = 'D'
WHERE
(t1.temp_message_split_key < t2.temp_message_split_key AND t1.DH_MEMBER_ID = t2.DH_MEMBER_ID)
AND nullif(t1.dh_member_id,'') IS NOT NULL;
Here is create table DDL
CREATE TABLE
temp_message_split
(
FIRST_NAME VARCHAR(20),
LAST_NAME VARCHAR(30),
DOB VARCHAR(10),
EMPLOYEE_ID VARCHAR(20),
CES_CUST_NUM VARCHAR(7),
MED_POLICY_NUM VARCHAR(20),
EMAIL_ADDR VARCHAR(50),
DH_MEMBER_ID VARCHAR(9),
ALT_ID VARCHAR(20),
DRSN VARCHAR(2),
SSN VARCHAR(9),
EPIPHANY_MEMBER_ID VARCHAR(18),
PORTAL_ADDRESS VARCHAR(30),
STATEMENT_VENDOR VARCHAR(20),
CONTENT_KEY VARCHAR(18),
EPIPHANY_COMMUNICATION_ID VARCHAR(200),
PRIORITY VARCHAR(4),
DAYS_UNTIL_EXPIRED VARCHAR(4),
CONTENT_DTL_KEY VARCHAR(18),
STATUS VARCHAR(1),
ACTIVATION_MEMBER_KEY bigint,
MESSAGE_BOARD_KEY bigint,
PORTAL_STATEMENT_LOC_KEY bigint,
temp_message_split_KEY bigint NOT NULL AUTO_INCREMENT,
PRIMARY KEY (temp_message_split_KEY),
INDEX EPIPHANY_COMMUNICATION_ID_IDX (EPIPHANY_COMMUNICATION_ID),
INDEX TEMP_MESSAGE_SPLIT_IDX2 (ALT_ID),
INDEX TEMP_MESSAGE_SPLIT_IDX3 (DRSN),
INDEX TEMP_MESSAGE_SPLIT_IDX4 (ALT_ID, DRSN),
INDEX TEMP_MESSAGE_SPLIT_IDX1 (DH_MEMBER_ID)
)
ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here is its explain plan:
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | index | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | TEMP_MESSAGE_SPLIT_IDX1 | 30 | NULL | 1619639 | 100.00 | Using index |
| 1 | UPDATE | t1 | NULL | ALL | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | NULL | NULL | NULL | 1619639 | 33.33 | Range checked for each record (index map: 0x5) |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
2 rows in set (0.00 sec)
This query is taking more than a day for processing 1700128 in the temp_message_split table and we need to tune it in such a way that it will take as much min. time as possible.
Upvotes: 0
Views: 65
Reputation: 7
Thank you Gordan for your last comment. I have updated the query based on this and now it is getting completed within seconds. Here is my new query:
UPDATE TEMP_MESSAGE_SPLIT t1 , TEMP_MESSAGE_SPLIT t2
SET t1.STATUS = 'D'
WHERE
(t1.temp_message_split_key < t2.temp_message_split_key AND t1.DH_MEMBER_ID = t2.DH_MEMBER_ID)
AND t1.DH_MEMBER_ID <> ''
Would you be able to help me understand how such minor change in query caused such a huge difference in query performance.
Upvotes: 0
Reputation: 1269683
My best guess is that you want to set the status to D
for all but the highest value of temp_message_split_key
for each DH_MEMBER_ID
.
The best solution is NOT EXISTS
, but MySQL doesn't support NOT EXISTS
on the same table in an UPDATE
query.
So, another method uses GROUP BY
:
UPDATE TEMP_MESSAGE_SPLIT t1 JOIN
(SELECT t2.DH_MEMBER_ID, MAX(t2.temp_message_split_key) as max_temp_message_split_key
FROM TEMP_MESSAGE_SPLIT t2
GROUP BY t2.DH_MEMBER_ID
) t2
ON t1.DH_MEMBER_ID = t2.DH_MEMBER_ID AND
t1.temp_message_split_key < t2.max_temp_message_split_key
SET t1.STATUS = 'D';
An index on (dh_member_id, temp_message_split_key)
may help performance.
This will still take a long time, because you are (presumably) updating a lot of rows. If possible, then it is probably simpler just to create a new table with the values you want. That will be much faster (due to logging and locking).
The NULLIF()
is probably doing nothing, but it has a minimal affect on the performance of your query. It would be better written as t1.dh_member_id <> ''
.
Upvotes: 1