Count Distinct Duplicates Within Groups in MySql

Question

I have an SQL table called main_table:

id | product_id | purchase_id | purchaser_id
---+------------+-------------+-------------
 1 |          1 |           1 |            1
 2 |          1 |           2 |            1
 3 |          1 |           3 |            1
 4 |          1 |           4 |            2
 5 |          1 |           5 |            2
 6 |          1 |           6 |            3
 7 |          2 |           1 |            1
 8 |          2 |           4 |            2
 9 |          2 |           5 |            2
10 |          2 |           7 |            2
11 |          2 |           8 |            2
12 |          2 |           6 |            3
13 |          2 |           9 |            3
14 |          2 |          10 |            3
15 |          2 |          11 |            3
16 |          2 |          12 |            4
17 |          2 |          13 |            4

I need to group by product_id and find the four things:

# purchases
# purchasers
# repeat purchases
# repeat purchasers

So the first 3 are relatively simple...

SELECT FROM `main_table`
  product_id,
  COUNT(DISTINCT `purchase_id`) AS `purchases`,
  COUNT(DISTINCT `purchaser_id`) AS `purchasers`,
  (COUNT(DISTINCT `purchase_id`) - COUNT(DISTINCT `purchaser_id`)) AS `repeat_purchases`,
  (??????) AS `repeat_purchasers`
GROUP BY product_id
ORDER BY product_id ASC

What is the ?????? in order to get the following table:

product_id | purchases | purchasers | repeat_purchases | repeat_purchasers
-----------+-----------+------------+------------------+------------------
         1 |         6 |          3 |                3 |                2 
         2 |        11 |          4 |                7 |                3

John Woo · Accepted Answer

SELECT  a.product_id,
        COUNT(DISTINCT a.purchase_id) AS purchases,
        COUNT(DISTINCT a.purchaser_id) AS purchasers,
        (COUNT(DISTINCT a.purchase_id) - COUNT(DISTINCT a.purchaser_id)) AS repeat_purchases,
        COALESCE(c.totalCount,0) AS repeat_purchasers
FROM    main_table a
        LEFT JOIN
        (
            SELECT  product_id, COUNT(totalCOunt) totalCount
            FROM    
                    (
                        SELECT  product_id, purchaser_id, COUNT(*) totalCOunt
                        FROM    main_table
                        GROUP   BY product_id, purchaser_id
                        HAVING  COUNT(*) > 1
                    ) s
            GROUP   BY product_id
        ) c ON  a.product_id = c.product_id
GROUP   BY product_id

SQLFiddle Demo

OUTPUT

╔════════════╦═══════════╦════════════╦══════════════════╦═══════════════════╗
║ PRODUCT_ID ║ PURCHASES ║ PURCHASERS ║ REPEAT_PURCHASES ║ REPEAT_PURCHASERS ║
╠════════════╬═══════════╬════════════╬══════════════════╬═══════════════════╣
║          1 ║         6 ║          3 ║                3 ║                 2 ║
║          2 ║        11 ║          4 ║                7 ║                 3 ║
╚════════════╩═══════════╩════════════╩══════════════════╩═══════════════════╝

Count Distinct Duplicates Within Groups in MySql

Answers (2)

Related Questions