user5410996
user5410996

Reputation: 25

count distinct and window functions

I have a list of IDs, transactions, the date of those transactions, and a category of those transactions. I want to create a count of each distinct category within each ID

The starting table I have is looks something like this:

id  trxn_dt     trxn_amt trxn_category
1   10/31/2014  58       apple
1   11/9/2014   34       banana
1   12/10/2014  12       apple
2   7/8/2014    78       banana
2   11/20/2014  99       banana
3   1/5/2014    120      orange
4   2/17/2014   588      apple
4   2/18/2014   8        banana
4   3/9/2014    65       orange
4   4/25/2014   74       apple

And I want the end result to look something like this:

id  trxn_dt     trxn_amt trxn_category  number_category
1   10/31/2014  58       apple          2
1   11/9/2014   34       banana         2 
1   12/10/2014  12       apple          2
2   7/8/2014    78       banana         1
2   11/20/2014  99       banana         1
3   1/5/2014    120      orange         1
4   2/17/2014   588      apple          3
4   2/18/2014   8        banana         3
4   3/9/2014    65       orange         3
4   4/25/2014   74       apple          3

I've tried using count(distinct(trxn_category)) over(partition by id,trxn_category order by id) as number_category but I get an error about using 'distinct'

Upvotes: 1

Views: 5404

Answers (3)

dnoeth
dnoeth

Reputation: 60462

Most DBMSes don't support DISTINCT in Windowed Functions, but you can emulate COUNT(DISTINCT) using two DENSE_RANKs:

DENSE_RANK() over (partition by id,trxn_category order by id ASC)-
DENSE_RANK() over (partition by id,trxn_category order by id DESC)

or a nested MAX(DENSE_RANK):

select
   MAX(dr) over (partition by id,trxn_category)
from
 (
   select 
      DENSE_RANK() over (partition by id,trxn_category order by id DESC) as dr
 ) 

Upvotes: 2

Matt
Matt

Reputation: 14341

;WITH cteCounts AS (
    SELECT
       id
       ,COUNT(DISTINCT trxn_category) as CategoryCount
    FROM
       table
    GROUP BY
       id
)


SELECT
    t.id
    ,trxn_dt
    ,trxn_amt
    ,trxn_category
    ,c.CategoryCount
FROM
    table t
    INNER JOIN cteCounts c
    ON t.id = c.id

Can't use DISTINCT in the partitioned aggregation.

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72175

You can use a correlated subquery for this:

SELECT id, trxn_dt, trxn_amt, trxn_category,
       (SELECT COUNT(DISTINCT trxn_category)
        FROM mytable AS t2
        WHERE t2.id = t1.id) AS cnt
FROM mytable AS t1

Demo here

Upvotes: 2

Related Questions