sandboxj
sandboxj

Reputation: 1254

SQL EXISTS returns all rows, more than two tables

I know similar questions like this have been asked before, but I have not seen one for more than 2 tables. And there seems to be a difference.

I have three tables from which I need fields, customers where I need customerID and orderID from, orders from which I get customerID and orderID and lineitems from which I get orderID and quantity (= quantity ordered).

I want to find out how many customers bought more than 2 of the same item, so basically quantity > 2 with:

SELECT COUNT(DISTINCT custID) 
FROM customers
WHERE EXISTS(
    SELECT *
    FROM customers C, orders O, lineitems L
    WHERE C.custID = O.custID AND O.orderID = L.orderID AND L.quantity > 2
    );

I do not understand why it is returning me the count of all rows. I am correlating the subqueries before checking the >2 condition, am I not?

I am a beginner at SQL, so I'd be thankful if you could explain it to me fundamentally, if necessary. Thanks.

Upvotes: 0

Views: 1411

Answers (3)

Nicolas Arias
Nicolas Arias

Reputation: 60

I suggest you to use "joins" ,

Try this

select count(*) From orders o inner join lineitems l on l.orderID = o.orderID where l.quantity > 2

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269883

I would approach this as two aggregations:

select count(distinct customerid)
from (select o.customerid, l.itemid, count(*) as cnt
      from lineitems li join
           orders o
           on o.orderID = l.orderId
      group by o.customerid, l.itemid
     ) ol
where cnt >= 2;

The inner query counts the number of items that each customer has purchased. The outer counts the number of customers.

EDIT:

I may have misunderstood the question for the above answer. If you just want where quantity >= 2, then that is much easier:

select count(distinct o.customerid)
from lineitems li join
     orders o
     on o.orderID = l.orderId
where l.quantity >= 2;

This is probably the simplest way to express the query.

Upvotes: 1

Giorgos Betsos
Giorgos Betsos

Reputation: 72175

You don't have to repeat customers table in the EXISTS subquery. This is the idea of correlation: use the table of the outer query in order to correlate.

SELECT COUNT(DISTINCT custID) 
FROM customers c
WHERE EXISTS(
    SELECT *
    FROM orders O
    JOIN lineitems L ON O.orderID = L.orderID
    WHERE C.custID = O.custID AND L.quantity > 2
    );

Upvotes: 2

Related Questions