Reputation: 21
I have a SQLite table in which I would like to filter customers which have bought 2 or more DISTINCT items. The table looks like this:
customer | product | quantity | date of purchase
---------------------------------------------
John | coffee | 1 | 02.20.2019
Bob | coffee | 2 | 01.13.2018
Chris | milk | 1 | 02.03.2012
Alice | bread | 1 | 05.01.2019
John | banana | 2 | 08.04.2015
Mary | orange | 3 | 12.16.2018
Mary | orange | 2 | 04.21.2019
Chris | pie | 1 | 07.02.2004
I've managed to select customers by grouping them using two rows in the GROUP BY query (customer and product):
SELECT * FROM test_table GROUP BY customer, product
This (almost) gave me all the customers that I'd originally wanted, except it also included Mary, which is not the desired output (she didn't buy two different products). I've tried to use a COUNT DISTINCT function to only show customers which have bought two or more different products like so:
SELECT customer, product, COUNT(DISTINCT(product)) AS counter FROM test_table
GROUP BY customer, product HAVING counter>1
The query successfully returned Chris and John without Mary, however it only outputs them once, showing only one of the products that they've bought. Is there a way in which I could get Chris and John with all distinct products that they've bought? The expected output of the query:
John | coffee | 1 | 02.20.2019
John | banana | 2 | 08.04.2015
Chris | milk | 1 | 02.03.2012
Chris | pie | 1 | 07.02.2004
Upvotes: 2
Views: 1740
Reputation: 164089
You can use EXISTS
to get all the rows:
SELECT t.*
FROM test_table t
WHERE EXISTS (SELECT 1 FROM test_table WHERE customer = t.customer AND product <> t.product)
Result:
> customer | product | quantity | date_of_purchase
> :------- | :------ | -------: | :---------------
> John | coffee | 1 | 02.20.2019
> Chris | milk | 1 | 02.03.2012
> John | banana | 2 | 08.04.2015
> Chris | pie | 1 | 07.02.2004
Or if you want 1 row for each customer
with the products and dates as comma separated lists you can use GROUP_CONCAT()
:
SELECT customer,
GROUP_CONCAT(product) products,
GROUP_CONCAT(date_of_purchase) dates
FROM test_table
GROUP BY customer
HAVING COUNT(DISTINCT product) > 1
Result:
> customer | products | dates
> :------- | :------------ | :--------------------
> Chris | milk,pie | 02.03.2012,07.02.2004
> John | coffee,banana | 02.20.2019,08.04.2015
See the demo.
Upvotes: 3
Reputation: 521239
Using aggregation:
SELECT *
FROM test_table
WHERE customer IN (SELECT customer
FROM test_table
GROUP BY customer
HAVING MIN(product) <> MAX(product));
The HAVING
clause above asserts that the "minimum" and "maximum" products are not the same, which implies that there be at least two distinct products associated with each matching customer.
For a more sophisticated solution, we could use analytic functions here:
WITH cte AS (
SELECT *, MIN(product) OVER (PARTITION BY customer) AS min_product,
MAX(product) OVER (PARTITION BY customer) AS max_product
FROM test_table
)
SELECT customer, product, quantity, date_of_purchase
FROM cte
WHERE min_product <> max_product;
Upvotes: 2