Reputation: 607
I have a below query which I run gives me the single count for month
as current month and dates_for_week
has list of all the dates for last week from Sunday to Saturday.
select COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and (month = '10' and dates_for_week IN ('18', '19', '20', '21', '22', '23', '24'))
As of now the output I see is this -
Count
-----
982
Now I am trying to make this query dynamic such that it can give me count for past 6 weeks something like below as an output:
Count Week
------------
982 W43
123 W42
126 W41
127 W40
128 W39
129 W38
I am able to convert above query in a dynamic way which gives me the count for current month october and previous week which is 43 and it works fine as shown below but I am not sure how can I change it so that it can give me data for all past 6 weeks in the above output format. It looks like I need to change the month also dynamically for some week to get output for past 6 weeks.
select COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and (
month = extract(month from current_date)
and dates_for_week IN (
select
date_part('d',((DATE_TRUNC('week', CURRENT_DATE) - 9) + row_number() over (order by true))::date)
from process.data
limit 7
)
)
So what I need is this for last 6 week and group by week to give me the count as shown above. Is this possible to do by any chance?
and (month = '10' and dates_for_week IN ('18', '19', '20', '21', '22', '23', '24'))
and (month = '10' and dates_for_week IN ('11', '12', '13', '14', '15', '16', '17'))
and (month = '10' and dates_for_week IN ('4', '5', '6', '7', '8', '9', '10'))
and (month IN ('9', '10') and dates_for_week IN ('27', '28', '29', '30', '1', '2', '3'))
and (month = '9' and dates_for_week IN ('20', '21', '22', '23', '24', '25', '26'))
and (month = '9' and dates_for_week IN ('13', '14', '15', '16', '17', '18', '19'))
Upvotes: 0
Views: 971
Reputation: 2956
You have years, months and days in seperate columns if I understand it correctly. I think the easiest way is to "build" a proper date column and then working with this column.
The following query should give you the last 6 weeks including the current week.
select
EXTRACT(week from TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') >= DATEADD(day,-42,DATE_TRUNC('week', sysdate))
GROUP BY 1
ORDER BY 1 desc
However, there might be a challenge as weeks start on a Monday in redshift, so might need to do a slight manipulation (adding one day):
select
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
and DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) BETWEEN DATEADD(day,-42,DATE_TRUNC('week', sysdate)) AND DATEADD(day,-1,DATE_TRUNC('week', sysdate))
GROUP BY 1
ORDER BY 1 desc
I would start running this query first, to check if the date is calculated properly
select COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') between '2020-10-18' and '2020-10-24'
Afterwards I would see if the week is calculated correctly:
select
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') between '2020-10-18' and '2020-10-24'
group by 1
order by 1
And last but not least I would extend the timeframe and make it dynamic:
select
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) Between DATEADD(day,-42,DATE_TRUNC('week', sysdate)) and DATEADD(day,-1,DATE_TRUNC('week', sysdate))
group by 1
order by 1
Upvotes: 2
Reputation: 1271003
You could use order by
and limit
:
select year, week, COUNT(DISTINCT CLIENTID)
from process.data
where type = 'pots' and
stype= 'kites' and
tires IN ('abc', 'def', 'ghi', 'jkl') and
comp IN ('data', 'hello', 'world')
group by year, dates_for_week
order by year desc, week desc
limit 6;
This is assuming that you have a week column, which seems like a reasonable assumption.
This is a simple way to accomplish what you want to do. I am guessing that on Redshift it should have decent performance.
Upvotes: 0
Reputation: 4354
Assuming that you have some kind of date column, you can simply use something like this
select date_part(w, {your_date_column) as week_number,
COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
group by 1
Upvotes: 0