F.P
F.P

Reputation: 17831

How can this MySQL query be optimized

Given the following table structure

images                 data
 ------------------    ----------------------------------
| id    | filename |  | fromImageId | toImageId | result |
 ------------------    ----------------------------------

I have the following SELECT to get all images.filename that do not have an entry in data (either fromImageId or toImageId)

SELECT image.id, image.filename
FROM images image
WHERE NOT EXISTS(
    SELECT fromImageId, toImageId
    FROM data results
    WHERE fromImageId = image.id
        OR toImageId = image.id
) ORDER BY image.id

As the data table is quite large (500,000+) this query takes a very long time to compute (around 10-15 seconds).

I'm very sure there is much optimization to be done here, but I can't quite get my head around what I could do to optimize the query.

PS: images.id is a PRIMARY KEY and both fromImageId and toImageId are FOREIGN KEYs on images.id

Upvotes: 0

Views: 110

Answers (4)

Michael
Michael

Reputation: 12806

SELECT
    images.id,
    filename
FROM
    images
LEFT JOIN `data` ON images.id = fromImageId
OR images.id = toImageId
WHERE
    fromImageId IS NULL AND toImageId IS NULL

And ensure indexes are on fromImageId and toImageId.

Upvotes: 4

DRapp
DRapp

Reputation: 48139

ensure your "Data" table has AT LEAST two indexes... One on just the FROM image id, the other on To Image ID. Then, slightly different than Michael was proposing

select STRAIGHT_JOIN
      i.ID,
      i.FileName
   from
      Images i
         LEFT JOIN Data d1
            on i.ID = d1.FromImageID
         LEFT JOIN Data d2
            on i.ID = d2.ToImageID
   where
          d1.FromImageID is null
      AND d2.ToImageID is null

With the two individual indexes, this query starts with you image file and gets lined-up with TWO versions of your Data table... respectively joined by EITHER from or to image values AT THE SAME TIME. So now, it should just blow through and kick out only those entries where BOTH "Data" tables DONT find a match.

Upvotes: 2

nbaztec
nbaztec

Reputation: 402

Another way I can think of forming this query is:

SELECT image.id, image.filename
FROM images image
WHERE image.id NOT IN(
    SELECT fromImageId, toImageId
    FROM data results
) ORDER BY image.id

don't really know about the "optimization" it will do, but maybe you should think of making this a Stored Procedure.

Upvotes: 0

Chetter Hummin
Chetter Hummin

Reputation: 6817

NOT IN might be a better choice. Haven't tested it, but please try the following

SELECT image.id, image.filename
FROM images image
WHERE image.id NOT IN(
    SELECT IFNULL(fromImageId, toImageId)
    FROM data results
    WHERE fromImageId = image.id
        OR toImageId = image.id
) ORDER BY image.id

Upvotes: 0

Related Questions