Reputation: 51
I have a address table which is referenced from 6 other tables (sometimes multiple tables). Some of those tables have around half a million records (and the address table around 750000 records). I want to have a periodical query running which deletes all records that are not referenced from any of the tables.
The following sub-queries is not a option, because the query never finishes - the scope is too big.
delete from address where address_id not in (select ...)
and not in (select ...) and not in (select ...) ...
What I was hoping was that I could use the foreign key constraint and I could simply delete all records for which the foreign key constraint does not stop me (because there is no reference to the table). I could not find a way to do this (or is there?). Anybody another good idea to tackle this problem?
Upvotes: 5
Views: 3447
Reputation: 2230
I always use this:
DELETE FROM table WHERE id NOT IN (SELECT id FROM OTHER table)
Upvotes: 1
Reputation: 38502
You can try this ways
DELETE
address
FROM
address
LEFT JOIN other_table ON (address.id = other_table.ref_field)
LEFT JOIN other_table ON (address.id = other_table2.ref_field)
WHERE
other_table.id IS NULL AND other_table2.id IS NULL
OR
DELETE
FROM address A
WHERE NOT EXISTS (
SELECT 1
FROM other_table B
WHERE B.a_key = A.id
)
Upvotes: 3
Reputation: 996
I'd do this by first creating a TEMPORARY TABLE (t) that is a UNION of the IDs in the 6 referencing tables, then run:
DELETE x FROM x LEFT JOIN t USING (ID) WHERE x.ID IS NULL;
Where x is the address table.
See 'Multiple-table syntax' here: http://dev.mysql.com/doc/refman/5.0/en/delete.html
Obviously, your temporary table should have its PRIMARY KEY on ID. It may take some time to query and join, but I can't see a way round it. It should be optimized, unlike the multiple sub-query version.
Upvotes: 0