Reputation: 41
I've spent ages researching this and found nothing that directly addresses the question.
I’m trying to see which customers are ‘like’ other customers at an aggregate level.
How do I count the number of duplicate values on one column for certain values of another column?
In the example below, I’d like to know how many records for ‘Customer 112’ and ‘Customer 113’ have duplicate variable values to those shown for “Customer 111”.
The answer is: Customer 112 = 3 (27, 28 and 30 are all duplicates of values shown for Customer 111)
and Customer 113 = 2 (24 and 26 are both duplicates of values shown for Customer 111)
.
Customer Variable
111 21
111 22
111 23
111 24
111 26
111 27
111 28
111 29
111 30
112 23
112 27
112 28
112 30
112 31
112 33
112 35
113 24
113 26
113 33
113 35
The output would be:
Customer Count
112 3
113 2
Any suggestions would be greatly appreciated.
Upvotes: 1
Views: 133
Reputation: 1271231
Here is one way to do it, by joining to the "111" customer values and then aggregating:
select t.customer, count(t111.variable) as "count"
from t left outer join
(select t.*
from t
where customer = 111
) t111
on t.variable = t111.variable
group by t.customer;
I think the above is clear on what it is doing. However, you can eliminate the subquery (which is nice in MySQL):
select t.customer, count(t111.variable) as "count"
from t left outer join
t t111
on t.variable = t111.variable and t111.customer = 111
group by t.customer;
Upvotes: 3
Reputation: 7240
This would give you:
Customer Count
112 3
113 2
Here is the code:
SELECT customer,count(variable)
FROM t where variable in
(select variable from t where customer=111)
GROUP BY customer
HAVING customer!=111;
See it in http://www.sqlfiddle.com/#!2/3b3bc/15
Upvotes: 1