Nic
Nic

Reputation: 27

Select where x is true but not y

I have a table that stores customer transaction data. I want to select all customers that have purchased item 1 (64244), but remove/ignore them from the results once they've purchased item 2 (64270). The first half is pretty straightforward, it's the ignoring item 2 results I'm struggling with. Here's what I have so far that is not working.

SELECT CustomerID, Name, PurchaseDate
FROM Orders
WHERE ProductCode = '64244' 
and ProductCode NOT IN (SELECT ProductCode FROM Orders WHERE ProductCode = '64270')

Correct me if I'm wrong, but I think the subquery in my WHERE clause is effectively the same as

WHERE ProductCode = '64244' and not ProductCode = '64270'

Which also does not work.

I've also tried using EXIST and NOT EXIST to filter the item 2 sales to no avail.

EXIST seems to find the results I don't want, and NOT EXIST seems to remove all results.

WHERE ProductCode = '64244'
and EXISTS (SELECT ProductCode FROM Orders WHERE ProductCode = '64270')

Upvotes: 0

Views: 102

Answers (4)

Roger Wolf
Roger Wolf

Reputation: 7712

I want to select all customers

Since you are looking for customers' attributes and not for those of orders, that's how you query should look like:

select c.*
from Customers c
where exists (
  select 0 from Orders ord
  where ord.CustomerID = c.CustomerID
    and ord.ProductCode = '64244'
)
  and not exists (
  select 0 from Orders ord
  where ord.CustomerID = c.CustomerID
    and ord.ProductCode = '64270'
);

Despite the fact that the query reads the Orders table twice, it should be pretty fast, given the appropriate indices are in place.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522509

The answer by @TimRoberts is canonical, but you could also use an aggregation here:

WITH cte AS (
    SELECT CustomerID
    FROM Orders
    GROUP BY CustomerID
    HAVING COUNT(CASE WHEN ProductCode = '64244' THEN 1 END) > 0 AND
           COUNT(CASE WHEN ProductCode = '64270' THEN 1 END) = 0
)

 SELECT CustomerID, Name, PurchaseDate
 FROM Orders
 WHERE CustomerID IN (SELECT CustomerID FROM cte);

The aggregation CTE above should benefit from an index on (CustomerID, ProductCode):

CREATE INDEX idx ON Orders (CustomerID, ProductCode);

Upvotes: 2

ValNik
ValNik

Reputation: 5916

Your 3-d query example can work with NOT EXISTS for this CustomerId

SELECT CustomerID, Name, PurchaseDate
FROM Orders o
WHERE ProductCode = '64244' 
   and NOT EXISTS (SELECT 1 FROM Orders o2
                   WHERE o2.ProductCode = '64270'
                    and o2.CustomerId=o.CustomerId)

Upvotes: 3

Tim Roberts
Tim Roberts

Reputation: 54767

Close. You don't want to check product codes, you want to check customers:

SELECT CustomerID, Name, PurchaseDate
FROM Orders
WHERE ProductCode = '64244' 
AND CustomerID NOT IN (SELECT CustomerID FROM Orders WHERE ProductCode = '64270')

So, members of the first set that are not members of the second set.

Upvotes: 4

Related Questions