Reputation: 17831
Given the following table structure
images data
------------------ ----------------------------------
| id | filename | | fromImageId | toImageId | result |
------------------ ----------------------------------
I have the following SELECT
to get all images.filename
that do not have an entry in data
(either fromImageId
or toImageId
)
SELECT image.id, image.filename
FROM images image
WHERE NOT EXISTS(
SELECT fromImageId, toImageId
FROM data results
WHERE fromImageId = image.id
OR toImageId = image.id
) ORDER BY image.id
As the data
table is quite large (500,000+) this query takes a very long time to compute (around 10-15 seconds).
I'm very sure there is much optimization to be done here, but I can't quite get my head around what I could do to optimize the query.
PS: images.id
is a PRIMARY KEY
and both fromImageId
and toImageId
are FOREIGN KEY
s on images.id
Upvotes: 0
Views: 110
Reputation: 12806
SELECT
images.id,
filename
FROM
images
LEFT JOIN `data` ON images.id = fromImageId
OR images.id = toImageId
WHERE
fromImageId IS NULL AND toImageId IS NULL
And ensure indexes are on fromImageId
and toImageId
.
Upvotes: 4
Reputation: 48139
ensure your "Data" table has AT LEAST two indexes... One on just the FROM image id, the other on To Image ID. Then, slightly different than Michael was proposing
select STRAIGHT_JOIN
i.ID,
i.FileName
from
Images i
LEFT JOIN Data d1
on i.ID = d1.FromImageID
LEFT JOIN Data d2
on i.ID = d2.ToImageID
where
d1.FromImageID is null
AND d2.ToImageID is null
With the two individual indexes, this query starts with you image file and gets lined-up with TWO versions of your Data table... respectively joined by EITHER from or to image values AT THE SAME TIME. So now, it should just blow through and kick out only those entries where BOTH "Data" tables DONT find a match.
Upvotes: 2
Reputation: 402
Another way I can think of forming this query is:
SELECT image.id, image.filename
FROM images image
WHERE image.id NOT IN(
SELECT fromImageId, toImageId
FROM data results
) ORDER BY image.id
don't really know about the "optimization" it will do, but maybe you should think of making this a Stored Procedure.
Upvotes: 0
Reputation: 6817
NOT IN might be a better choice. Haven't tested it, but please try the following
SELECT image.id, image.filename
FROM images image
WHERE image.id NOT IN(
SELECT IFNULL(fromImageId, toImageId)
FROM data results
WHERE fromImageId = image.id
OR toImageId = image.id
) ORDER BY image.id
Upvotes: 0