Reputation: 696
I would like to get the number of duplicates for article_id
for each merchant_id
, where the zip_code
is not identical. Please see example below:
Table
merchant_id article_id zip_code
1 4555 1000
1 4555 1003
1 4555 1002
1 3029 1000
2 7539 1005
2 7539 1005
2 7539 1002
2 1232 1006
3 5555 1000
3 5555 1001
3 5555 1002
3 5555 1003
Output Table
merchant_id count_duplicate
1 3
2 2
3 4
This is the query that I am currently using but I am struggling to include the zip_code condition:
SELECT merchant_id
,duplicate_count
FROM main_table mt
JOIN(select article_id, count(*) AS duplicate_count
from main_table
group by article_id
having count(article_id) >1) mt_1
ON mt.article_id ON mt_1.article_id = mt.article_id
Upvotes: 0
Views: 35
Reputation: 1269913
If I understand correctly, you can use two levels of aggregation:
SELECT merchant_id, SUM(num_zips)
FROM (SELECT merchant_id, article_id, COUNT(DISTINCT zip_code) AS num_zips
FROM main_table
GROUP BY merchant_id, article_id
) ma
WHERE ma.num_zips > 1
GROUP BY merchant_id;
Upvotes: 1