John Thomas
John Thomas

Reputation: 1105

Counting distinct ID based on date

So I have a table as follows:

ID   create_date
001   01/01/2021
002   02/04/2021
003   07/22/2021
004   01/29/2021
005   03/01/2021

ID is unique for the table.

I have another table (below) where these IDs appear multiple times alongside another variable, titled code_id.

ID  code_id  date        data
001       A  01/01/2021  xxx
002       W  02/08/2021  xxx
002       B  03/06/2021  xxx
001       A  01/19/2021  xxx
002       C  05/01/2021  xxx
004       D  12/01/2021  xxx
001       K  01/02/2021  xxx
001       J  01/15/2021  xxx
005       A  03/01/2021  xxx
005       A  03/01/2021  xxx
005       B  03/05/2021  xxx
005       B  03/30/2021  xxx
005       C  03/30/2021  xxx
005       D  04/01/2021  xxx

What I want to do is create a new table (preferably via CTE, but open to join options) which show the distinct count of code_id after both 5 and 30 days from table1.create_date.

So in other words, how many different code_id's appear for each ID after x days from create_date, where x is equal to 5 and 30 respectively.

Here is the resulting table I seek:

ID   distinct_code_id_5_day   distinct_code_id_30_day   distinct_code_id_total
001                       2                         3                        3
002                       1                         2                        3 
003                       0                         0                        0 
004                       0                         0                        1
005                       2                         3                        4

In the case of ID = 001,we show all code_id's that appeared from 01/01/2021 - 01/05/2021, inclusive for distinct_code_id_5_day and 01/01/2021 - 01/30/2021, inclusive for distinct_code_id_30_day.

Upvotes: 1

Views: 398

Answers (1)

Felipe Hoffa
Felipe Hoffa

Reputation: 59175

You should be able to solve this with a join and a couple iff() with date math:

with ids as (
select split(value, '   ') x, x[0] id, x[1]::date create_date
from table(split_to_table('001   01/01/2021
002   02/04/2021
003   07/22/2021
004   01/29/2021
005   03/01/2021', '\n'))
), data as(
select split(value, ' ') x, x[0] id, x[7] code_id, x[9]::date date, x[11] data
from table(split_to_table('001       A  01/01/2021  xxx
002       W  02/08/2021  xxx
002       B  03/06/2021  xxx
001       A  01/19/2021  xxx
002       C  05/01/2021  xxx
004       D  12/01/2021  xxx
001       K  01/02/2021  xxx
001       J  01/15/2021  xxx
005       A  03/01/2021  xxx
005       A  03/01/2021  xxx
005       B  03/05/2021  xxx
005       B  03/30/2021  xxx
005       C  03/30/2021  xxx
005       D  04/01/2021  xxx', '\n')))

select id, count(distinct code5), count(distinct code30), count(distinct code_id)
from (
    select a.id, iff(a.create_date + 5 >= b.date, b.code_id, null) code5
        , iff(a.create_date + 30 >= b.date, b.code_id, null) code30
        , b.code_id
    from ids a
    left outer join data b
    where a.id=b.id
)
group by 1

enter image description here

Upvotes: 1

Related Questions