Reputation: 47
I'm trying to adapt the solutions here (SQL Delete Rows Based on Another Table) to my needs. E.g.,
DELETE
FROM complete_set
WHERE slice_name IN (SELECT slice_name FROM changes
GROUP BY slice_name HAVING COUNT(slice_name) > 1);
Tables definitions:
While running the query's components individually is extremely fast ...
E.g.,
SELECT slice_name
FROM changes
GROUP BY slice_name
HAVING COUNT(sym) > 1;
(off-the-cuff about a second), and
DELETE FROM complete_set
WHERE slice_name = 'ABC'
(also about a second, or so)
The above solution (w/ subquery) takes too long to execute be useful. Is there an optimization I can apply here?
Thanks for the assist.
Upvotes: 1
Views: 1110
Reputation: 21
If the table size is too big the above execution will definitely take lot of time because the inner query shall run for every outer query row during the deletion.
The deletion would be much quicker if all the individual deletion statement is defined separately and executed in a batch or sequentially.
Upvotes: 1
Reputation: 522161
One possible explanation for the slow delete is that takes some time for MySQL to lookup each slice_name
in the complete_set
table against the values in the subquery. We can try speeding this up as follows. First, create a new table to replace the subquery, which will serve as a materialized view:
CREATE TEMPORARY TABLE changes_view
(PRIMARY KEY pkey (slice_name))
SELECT slice_name
FROM changes
GROUP BY slice_name
HAVING COUNT(slice_name) > 1;
Now phrase your delete using a join:
DELETE t1
FROM complete_set t1
INNER JOIN changes_view t2
ON t1.slice_name = t2.slice_name;
The (intended) trick here is that the delete join should run fast because MySQL can quickly lookup a slice_name
value in the complete_set
table against the materialized view table, since the latter has an index on slice_name
.
Upvotes: 1