Reputation: 117
I have a SQL table in redshift that looks like:
ID | Date | Gender |
---|---|---|
A | 2019-02-01 | M |
B | 2019-02-01 | M |
A | 2019-01-01 | M |
C | 2019-03-01 | F |
B | 2019-03-01 | M |
C | 2019-02-01 | F |
A | 2019-09-01 | M |
I want to create a column that gives a count of the number of previous records within the past 60 days, for that specific ID. The first record (datewise) for each ID should have a count 0, as there would be no records before that one. Additionally, the input table is not grouped/sorted by ID or Date.
For this table, the expected output would be:
ID | Date | Gender | Number_in_past60 |
---|---|---|---|
A | 2019-02-01 | M | 1 |
B | 2019-02-01 | M | 0 |
A | 2019-01-01 | M | 0 |
C | 2019-03-01 | F | 1 |
B | 2019-03-01 | M | 1 |
C | 2019-02-01 | F | 0 |
A | 2019-09-01 | M | 0 |
The rows in the output table do not necessarily have to be in the same order as the input table. Any order/grouping is OK, as long as the new count
column is correct.
Upvotes: 0
Views: 295
Reputation: 13049
Not really fast but should do the job
select
t_ext.*,
(
select count(*)
from _table as t_int
where t_int."ID" = t_ext."ID"
and t_int."Date" between t_ext."Date" - 60 and t_ext."Date" - 1
) as "Number_in_past60"
from _table as t_ext;
Upvotes: 1
Reputation: 3585
general direction might looks like that, although it might require some tuning on a real data I guess
with dates as (
select date, id
from mytable
),
grouped as (
select
t.id, count(*) cnt_past60_days
from mytable t
join dates d on t.id = d.id
and date_diff('day', t.date, d.date) between -60 and 0
group by 1
)
select t1.*, g.cnt_past60_days
from mytable t1
left join grouped g on t1.id = g.id and t1.createddate = g.date
Upvotes: 1