Reputation: 77
I'm having trouble getting the count of customers that bought Item A and B . Below is a sample of the data I'm using.
Customer No | Item
___________________
1 A
1 B
2 B
3 A
4 A
4 B
5 B
6 A
I'm trying to get a count of how many customers bought Item A and B. This is what i tried so far. and I'm getting the following results Item A = 5, Item B = 3 and Item A and B = 6.
Select
count (distinct case when ItemNo = 'A' then customerNo end) as [A],
count (distinct case when ItemNo = 'B' then customerNo end) as [B],
count (distinct case when (ItemNo = 'A' or ItemNo = 'B') then customerNo end) as [AandB]
from Items
This is the result I'm trying to get:
Item | Count
A 4
B 4
A and B 2
Can someone please point me to the right direction to get this result. Thanks!
Upvotes: 2
Views: 401
Reputation: 61
select count (distinct case when I.ItemNo = 'A' then customerNo end) as A,
count (distinct case when I.ItemNo = 'B' then customerNo end) as B,
count (distinct case when (I.ItemNo = 'A' and (select count(*) from items as I2 where I2.itemno = 'B' and I.customerNo = I2.customerNo ) > 0)
then customerNo end) as AB
from Items as I;
Instead of using an 'OR' condition in the last COUNT, you want an 'AND'. Here, I'm using a correlated subquery to count the number of rows where the same customer (I.customerNo = I2.customerNo) ordered itemno 'B' (I2.itemno = 'B'). If the count is > 0, then the customer ordered item A and B.
The correlated subquery is established by the I.customerNo = I2.customerNo condition.
This is a longer way that works also if you cannot do a subquery in the COUNT
SELECT SUM(D.A), SUM(D.B), SUM(D.AB) FROM
(
select A, B, case when U.A > 0 AND U.B > 0 THEN 1 end AS AB
from (
select count (distinct case when I.ItemNo = 'A' then customerNo end) as A,
count (distinct case when I.ItemNo = 'B' then customerNo end) as B
from Items as I
group by I.customerNo
) AS U
) AS D;
Upvotes: 1
Reputation: 1270091
I would do this with two levels of aggregation:
select sum(has_a) as num_A,
sum(has_b) as num_B,
sum(has_a * has_b) as num_AB
from (select i.customer,
max(case when item_no = 'A' then 1 else 0 end) as has_A,
max(case when item_no = 'B' then 1 else 0 end) as has_B
from items
group by i.customer
) ic;
You can also get this information on different rows:
select has_a, has_b, count(*)
from (select i.customer,
max(case when item_no = 'A' then 1 else 0 end) as has_A,
max(case when item_no = 'B' then 1 else 0 end) as has_B
from items
group by i.customer
) ic
group by has_A, has_B;
This is not exactly the same, because the values are for only the combination of products. In other words, each customer is counted only once. What I like is:
Upvotes: 0
Reputation: 32695
Here is one more way to do it.
SQL Server has a useful operator INTERSECT
, which is exactly what you need here - an intersection of two sets (those who bought A
and those who bought B
).
I think it is more readable than obscure grouping with a HAVING
filter.
Performance-wise, you should try all variants on your real data.
Sample data
DECLARE @T TABLE (CustomerNo int, Item varchar(50));
INSERT INTO @T (CustomerNo, Item) VALUES
(1, 'A'),
(1, 'B'),
(2, 'B'),
(3, 'A'),
(4, 'A'),
(4, 'B'),
(5, 'B'),
(6, 'A');
Query
SELECT
Item
,COUNT(DISTINCT CustomerNo) AS CustomerCount
,0 AS SortOrder
FROM @T
GROUP BY Item
UNION ALL
SELECT
'A & B' AS Item
,COUNT(*) AS CustomerCount
,1 AS SortOrder
FROM
(
SELECT CustomerNo
FROM @T
WHERE Item = 'A'
INTERSECT
SELECT CustomerNo
FROM @T
WHERE Item = 'B'
) AS T
ORDER BY SortOrder, Item
;
First part of the query counts customer with simple grouping of them by the Item
.
The second part of the query (after UNION ALL
) counts those customers that bought both A
and B
.
SortOrder
column is just to order the final result appropriately.
Result
+-------+---------------+-----------+
| Item | CustomerCount | SortOrder |
+-------+---------------+-----------+
| A | 4 | 0 |
| B | 4 | 0 |
| A & B | 2 | 1 |
+-------+---------------+-----------+
Upvotes: 2
Reputation: 46229
I think your expected result might be A = 4
and B = 4
.
You can try to use UNION ALL
to get A
,B
and A & B
amount.
SELECT ItemNo,COUNT(distinct customerNo) Count
FROM Items
GROUP BY ItemNo
UNION ALL
SELECT 'A and B',count(*)
FROM (
SELECT COUNT(DISTINCT ItemNo) cnt
FROM Items tt
WHERE ItemNo IN ('A','B')
GROUP BY [CustomerNo]
HAVING COUNT(DISTINCT ItemNo) = 2
) t1
Upvotes: 2
Reputation: 521674
Here is one approach. We can try first aggregating by customer to generate the A and B counts. Then, take aggregates of those counts.
WITH cte AS (
SELECT customerNo,
COUNT(CASE WHEN Item = 'A' THEN 1 END) AS a_cnt,
COUNT(CASE WHEN Item = 'B' THEN 1 END) AS b_cnt
FROM Items
GROUP BY customerNo
)
SELECT 'A' AS Item, COUNT(CASE WHEN a_cnt > 0 THEN 1 END) AS Count, 0 AS pos FROM cte
UNION ALL
SELECT 'B', COUNT(CASE WHEN b_cnt > 0 THEN 1 END), 1 FROM cte
UNION ALL
SELECT 'A and B', COUNT(CASE WHEN a_cnt > 0 AND b_cnt > 0 THEN 1 END), 2 FROM cte
ORDER BY pos;
Upvotes: 0