Ian
Ian

Reputation: 12241

MySQL Query to find customers who have ordered two specific products

I'm having trouble coming up with a query that will find all customers who have purchased both PROD1 and PROD2.

Here's a pseudo-query that kind of looks like what I want to do: (obviously this wouldn't work)

SELECT COUNT(DISTINCT userid) 
  FROM TRANSACTIONS 
 WHERE product_id = 'prod1' 
   AND product_id = 'prod2'

So basically I'm trying to get a count of the number of distinct userids that have a transaction in the transactions table for both product_id 'prod1' and 'prod2'. Each transaction is stored in a row in the transactions table.

Upvotes: 4

Views: 15548

Answers (6)

senz
senz

Reputation: 2158

Example for sakila db:

SELECT R.customer_id, GROUP_CONCAT(I.film_id) 
FROM sakila.rental R 
RIGHT OUTER JOIN sakila.inventory I ON R.inventory_id = I.inventory_id 
WHERE I.film_id IN (22,44) GROUP BY R.customer_id HAVING COUNT(*) = 2

Upvotes: 0

Bill Karwin
Bill Karwin

Reputation: 562230

I do this type of query in the following way:

SELECT COUNT(DISTINCT t1.userid) AS user_count
  FROM TRANSACTIONS t1
  JOIN TRANSACTIONS t2 USING (userid)
 WHERE t1.product_id = 'prod1' 
   AND t2.product_id = 'prod2';

The GROUP BY solution shown by @najmeddine also produces the answer you want, but it doesn't perform as well on MySQL. MySQL has a hard time optimizing GROUP BY queries.

You should try both queries, analyzing the optimization with EXPLAIN, and also run some tests and time the results given the volume of data in your database.

Upvotes: 6

chrisg
chrisg

Reputation: 318

SELECT COUNT(DISTINCT userId)
FROM(
  SELECT userId
  FROM transactions
  WHERE product = 'PROD1'
  INTERSECT
  SELECT userId
  FROM transactions
  WHERE product = 'PROD2');

The query creates two intermediate tables, one which contains userId of customer who bought PROD1 and another identical table for those who bought PROD2. The intersection operator returns a table which contains only rows found in both previous tables, i.e those who bought both products.

Upvotes: 0

Larry Lustig
Larry Lustig

Reputation: 50970

(Added new options below using the additional information provided by the user)

Try

SELECT * FROM Customers WHERE 
 EXISTS (SELECT * FROM Purchases WHERE ProductID = 'PROD1' AND CustID = Customers.CustID)
 AND
 EXISTS (SELECT * FROM Purchases WHERE ProductID = 'PROD2' AND CustID = Customers.CustID)

Or

SELECT * FROM Customers WHERE 
 CustID IN (SELECT CustID FROM Purchases WHERE ProductID = 'PROD1')
 AND
 CustID IN (SELECT CustID FROM Purchases WHERE ProductID = 'PROD2')

Or

SELECT UserID FROM Transactions WHERE ProductID = 'PROD1'
 AND EXISTS (SELECT * FROM Transactions WHERE UserID = T1.UserID 
  AND ProductID = 'PROD2')

Or

SELECT UserID FROM Transactions WHERE ProductID = 'PROD1'
 AND UserID IN (SELECT UserID FROM Transactions WHERE ProductID = 'PROD2')

Upvotes: 3

manji
manji

Reputation: 47968

SELECT userid
  FROM TRANSACTIONS
 WHERE product_id in ('prod1', 'prod2')
GROUP BY userid
HAVING COUNT(DISTINCT product_id) = 2

Upvotes: 6

iDevlop
iDevlop

Reputation: 25252

This is an Access answer based on the infamous Northwind sample db. You should be abe to translate that in mySql quite easily.

SELECT o.CustomerID, Sum([ProductID]='Prod1') AS Expr1, Sum([productid]='Prod1') AS Expr2
FROM Orders AS o INNER JOIN [Order Details] AS d ON o.OrderID = d.OrderID
GROUP BY o.CustomerID
HAVING (((Sum([ProductID]='Prod1'))<>0) AND ((Sum([productid]='Prod1'))<>0));

Upvotes: 0

Related Questions