Jim
Jim

Reputation: 22656

SQL Selecting rows and excluding children

I have two tables A and B. A is the parent of B. I'm trying to find all As which do not have a specific B as a child. Ordinarily I would do

SELECT A.id FROM A 
    WHERE A.id NOT IN 
        (SELECT B.AId FROM B WHERE B.someFK = foo);

However for performance reasons I'm keen not to use an inner select. I've tried something like:

SELECT A.id FROM A 
    LEFT JOIN B ON (A.id = B.AId)
    WHERE B.someFK != foo OR B.someFK IS NULL

The problem is this returns As which have more than one child regardless of if they have the specified B.

EDIT: Changed B.id to B.someFK

Upvotes: 6

Views: 837

Answers (5)

Gidon Wise
Gidon Wise

Reputation: 1916

You should put the entire criteria in the LEFT JOIN.

As far as the duplicate rows are concerned, just DISTINCT it.

SELECT DISTINCT A.id 
FROM A 
LEFT JOIN B 
ON A.id = B.AId AND B.someFK = "foo"
WHERE B.AId IS NULL

Upvotes: 0

Null Pointer
Null Pointer

Reputation: 9289

Try this

SELECT DISTINCT A.id FROM A 
LEFT JOIN B ON (A.id = B.AId)
WHERE B.id != foo OR B.id IS NULL

Upvotes: 0

DMA57361
DMA57361

Reputation: 3690

Your LEFT JOIN will return one row for each A-B link (or one row for an A with no link to B), and then removes the links that fail to match your criteria - leaving behind every other link that a particular A has in B, which is not what you want.

However, I think an inner select is necessary.
But maybe try an EXISTS instead:

SELECT A.id
FROM A 
WHERE NOT EXISTS (SELECT * FROM B WHERE B.AId = A.id AND B.someFK = foo);

Upvotes: 1

Falcon
Falcon

Reputation: 3160

I'd go with the exists clause, since it was made for such a purpose:

SELECT A.id FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.id = foo and B.AId=a.id);

In most databases it's far more performant than the in clause (which basically is an array to compare against) for large record sets.

Aggregations are also expensive, so using an exists statement is the way to go imho. You might try aggregations for your scenario, though.

Upvotes: 2

amarsuperstar
amarsuperstar

Reputation: 1783

Try using a MAX constraint on your parent table. I have not used MySql in a while, but this should give you the idea

SELECT A.id, MAX(B.AId) FROM A 
    LEFT JOIN B ON (A.id = B.AId)
    WHERE B.id != foo OR B.id IS NULL
GROUP BY A.id

Upvotes: 0

Related Questions