Reputation: 17402

Subqueries vs Multi Table Join

I have three tables, A, B, and C, and I want to count the ids that appear in all three (the intersection).

Way 1:-

select count(a.id) from A a join B b on a.id = b.id join C c on b.id = c.id;

Result Count - X

Way 2:-

SELECT count(id) FROM A WHERE id IN (SELECT id FROM B WHERE id IN (SELECT id FROM C));

Result Count - Y

The two queries return different counts. What exactly is wrong?

Upvotes: 0

Answers (1)

Gordon Linoff

Reputation: 1270873

A JOIN can multiply the number of rows as well as filter rows out: each row in A is repeated once for every matching row in B, and again for every matching row in C.

In this case, the second count should be the correct one because nothing is double counted, assuming id is unique in a. If it is not, the query needs count(distinct a.id).
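To see the multiplication concretely, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration (they are not the asker's data): id 1 is duplicated in B and id 2 is duplicated in C, so the join counts each of them twice while the IN version counts each once.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    CREATE TABLE C (id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1), (1), (2);       -- id 1 appears twice
    INSERT INTO C VALUES (1), (2), (2), (4);  -- id 2 appears twice
""")

# Way 1: every matching combination of rows produces one output row.
join_count = conn.execute("""
    SELECT COUNT(a.id)
    FROM A a
    JOIN B b ON a.id = b.id
    JOIN C c ON b.id = c.id
""").fetchone()[0]

# Way 2: IN only tests membership, so each row of A is counted at most once.
in_count = conn.execute("""
    SELECT COUNT(id) FROM A
    WHERE id IN (SELECT id FROM B WHERE id IN (SELECT id FROM C))
""").fetchone()[0]

print(join_count, in_count)  # prints "4 2"
```

With this data the join yields 4 rows (2 for id 1, 2 for id 2), while only 2 ids in A are actually present in both B and C.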

The equivalent using JOIN would use COUNT(DISTINCT):

select count(distinct a.id)
from A a join
     B b
     on a.id = b.id join
     C c
     on b.id = c.id;

I mention this for completeness but do not recommend this approach. Multiplying the number of rows just to remove them using distinct is inefficient.

In many databases, the most efficient method might be:

select count(*)
from a
where exists (select 1 from b where b.id = a.id) and
      exists (select 1 from c where c.id = a.id);

Note: This assumes there are indexes on the id columns and that id is unique in a.
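As a rough check of the EXISTS form, the sketch below runs it with Python's built-in sqlite3 module under the assumptions stated in the note: id is unique in a (declared here as a primary key) and b and c have indexes on id. The data and the index names idx_b_id and idx_c_id are hypothetical. It also runs the COUNT(DISTINCT) join to confirm both give the same answer on this data.

```python
import sqlite3

# In-memory database with hypothetical sample data and index names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);  -- id is unique in a
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    CREATE INDEX idx_b_id ON b (id);
    CREATE INDEX idx_c_id ON c (id);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (1), (2);
    INSERT INTO c VALUES (1), (2), (2), (4);
""")

# EXISTS only asks whether a match exists, so duplicates in b or c
# never produce extra rows.
exists_count = conn.execute("""
    SELECT COUNT(*)
    FROM a
    WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
      AND EXISTS (SELECT 1 FROM c WHERE c.id = a.id)
""").fetchone()[0]

# The join has to build the duplicated rows first and then de-duplicate them.
distinct_join_count = conn.execute("""
    SELECT COUNT(DISTINCT a.id)
    FROM a
    JOIN b ON a.id = b.id
    JOIN c ON b.id = c.id
""").fetchone()[0]

print(exists_count, distinct_join_count)  # prints "2 2"
```

Whether EXISTS is actually faster depends on the database and its optimizer; this sketch only shows that the two queries agree on the result.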

Upvotes: 2
