duplicates with condition

Question

I would like to get the number of duplicates for article_id for each merchant_id, where the zip_code is not identical. Please see example below:

Table

merchant_id     article_id   zip_code 
1               4555         1000
1               4555         1003
1               4555         1002
1               3029         1000
2               7539         1005
2               7539         1005
2               7539         1002
2               1232         1006
3               5555         1000
3               5555         1001
3               5555         1002
3               5555         1003

Output Table

merchant_id     count_duplicate
1                3
2                2
3                4

This is the query that I am currently using but I am struggling to include the zip_code condition:

SELECT merchant_id
       ,duplicate_count
FROM main_table mt 
JOIN(select article_id, count(*) AS duplicate_count
     from main_table
     group by article_id
     having count(article_id) >1) mt_1
ON mt.article_id ON mt_1.article_id = mt.article_id

Gordon Linoff · Accepted Answer

If I understand correctly, you can use two levels of aggregation:

SELECT merchant_id, SUM(num_zips)
FROM (SELECT merchant_id, article_id, COUNT(DISTINCT zip_code) AS num_zips
      FROM main_table
      GROUP BY merchant_id, article_id
     ) ma
WHERE ma.num_zips > 1
GROUP BY merchant_id;

duplicates with condition

Answers (1)

Related Questions