Mitchell
Mitchell

Reputation: 169

Left join appears to be hindering SQL query performance tremendously

I have two tables:

Table A:

Name, isPair

Table B:

Name1, Name2, Status

This is my query:

SELECT Name
FROM A
LEFT JOIN B ON (A.Name = B.Name2)
WHERE A.isPair = 'T' AND (B.status <> 'valid' OR B.status IS NULL)

I have millions of rows in both tables. At the current queries speed it will take over 3 months to complete. I have indexed both tables appropriately. When I originally did an INNER JOIN it only took 10 minutes to complete, but I discovered the query wasn't returning rows that were not in Table B's Name2 column, which was a problem as I need them returned.

Upvotes: 2

Views: 623

Answers (1)

Nick.Mc
Nick.Mc

Reputation: 19194

This query might return the correct results faster, or it might return incorrect results with no speed improvement

This is all based on SQL Server knowledge but I assume InnoDB has the same characteristics.

SELECT Name 
FROM A
WHERE A.isPair = 'T'
AND NOT EXISTS (
    SELECT 1 FROM B 
    WHERE A.Name = B.Name2
    AND B.status = 'valid'
    )

I hope I have rearranged the Boolean logic correctly.

Before you were searching for records in A that had no match in B or a match with status <> valid

The new query returns records from A where it can't find a match in B with status = valid. Hopefully that is the same thing.

There are two database concepts I am using here:

  1. When using EXISTS it can just see if the table record exists, it doesn't have to join to the table and retrieve values from it..... most query planners do this automatically though so this is a long shot

  2. The operator <> is non sargable which means it can't utilise any index which contains the column status... i.e. it can't explicitly seek an the index for the absence of a value, it can only search an index for specific value(s). So I've changed it to = for this reason and also because it supports the NOT EXISTS logic

Again I don't know much about InnoDB but I'm sure if it didn't have these limitations the tricks would have been copied by Oracle and Microsoft already.

Comparing query plans will give you some idea of whether this rewrite makes any difference. ALso comparing query plans between the existing INNER and OUTER versions of your query might shed some light on things.

Upvotes: 3

Related Questions