Reputation: 135
I'm using MySQL 5.0, and I need to fine tune this query. Can anyone please tell me what tuning I can do in this?
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
Upvotes: 0
Views: 86
Reputation: 126035
The first thing that I'd do is rewrite the subqueries as joins:
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
ON d.alert_master_id = h.alert_master_id
AND d.end_date IS NULL
AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
alert_sara_header s
JOIN alert_sara_lines l
ON l.alert_sara_master_id = s.sara_master_id
)
ON s.alert_master_id = h.alert_master_id
AND s.end_date IS NULL
AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
AND d.alert_master_id IS NULL
AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
If it's still slow after that, re-examine your indexing strategy. I'd suggest indexes over:
alert_appln_header(alert_master_id,created_date)
schedule_config(schedule_name)
alert_details(alert_master_id,end_date,created_date)
alert_sara_header(sara_master_id,alert_master_id,end_date,created_date)
alert_sara_lines(alert_sara_master_id)
Upvotes: 4
Reputation: 5662
OK, this may be just a shot in the dark, but I think you don't need as many DISTINCT
here.
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
-- removed distinct here --
SELECT alert_master_id FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
-- removed distinct here --
SELECT alert_master_id FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL)
AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
Since using the DISTINCT
is very costly, try to avoid it. In the first WHERE
clause you are checking for ids
that are NOT
within some result, so it shouldn't matter if in that result some ids
appear more than once.
Upvotes: 1