Ronald
Ronald

Reputation: 557

SQL Grouping with No Duplicates

Here is the output. No problem here. Exactly what I want. But I added the DISTINCT ID to remove duplicates and that works in each grouped month.

MN   | CNT
====================
1    |  1
10   |  2
11   |  5
12   |  5

SELECT EXTRACT(MONTH FROM TRUNC(HDATE)) as MN, COUNT(DISTINCT ID) as CNT
    FROM Schema.TRAVEL
        WHERE (ARR = '2' OR ARR = '3')
        AND
        HDATE BETWEEN to_date('2015-10-01', 'yyyy-mm-dd') AND to_date('2016-09-30', 'yyyy-mm-dd')
GROUP BY EXTRACT(MONTH FROM TRUNC(HDATE));

But I can still possibly have duplicates that span more than each month. So if I have a record in October and another in November with the same ID - I want to only count this once - that is my issue

So over the course of a year or any time period - an ID only gets counted once...but I still need to maintain the monthly groupings and output...

??

Upvotes: 0

Views: 53

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269623

In other words, you want to count each id in the first month where it appears.

SELECT EXTRACT(MONTH FROM TRUNC(HDATE)) as MN, COUNT(DISTINCT ID) as CNT
FROM (SELECT id, MIN(HDATE) as HDATE
      FROM Schema.TRAVEL t
      WHERE ARR IN  '2', '3') AND
            HDATE BETWEEN  DATE '2015-10-01' AND DATE '2016-09-30'
      GROUP BY id
     ) t
GROUP BY EXTRACT(MONTH FROM TRUNC(HDATE));

Note: If an id appears before '2015-10-01', this will still count the id in the first month it appears after that date. If you don't want such an id counted at all, move the HDATE comparison to the outer query.

Upvotes: 1

Related Questions