Reputation: 1216
I'm making a a query that gets the count of 2 columns and sum of another column, grouping by month from a date column of the last 13 months. This is my query :
SELECT TO_CHAR(colDate,'yyyy_MM') as month ,
COUNT(DISTINCT col1) AS col1,
COUNT(DISTINCT col2) as col2,
SUM(col3) as col3
FROM myTable
WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
from myTable
order by 1 desc
limit 13)
GROUP BY 1
The problem is that for each month, I also need the average of the previous 3 months of the :
COUNT(DISTINCT col1) AS col1, COUNT(DISTINCT col2) as col2, SUM(col3) as col3
so my query need to be something like :
SELECT TO_CHAR(colDate,'yyyy_MM') as month ,
COUNT(DISTINCT col1) AS col1,
COUNT(DISTINCT col2) as col2,
SUM(col3) as col3,
... as PreviousMonthsAvgCol1,
... as PreviousMonthsAvgCol2,
... as PreviousMonthsAvgCol3
FROM myTable
WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
from myTable
order by 1 desc
limit 13)
GROUP BY 1
The months before the first still need to be counted in the average of the first month.
Upvotes: 0
Views: 44
Reputation: 14045
Agree, with Gordon Lindoff's answer.
However, I recommend not using TO_CHAR()
in the date range predicate. This will force Redshift to scan more data than necessary.
Try using colDate BETWEEN '2017-01-01' and '2018-01-31'
or DATE_TRUNC()
if you must round the dates to whole months.
Upvotes: 0
Reputation: 1271003
If you don't need data from before the 13 month period, just use lag()
:
SELECT . . .,
LAG(COUNT(DISTINCT col1)) OVER (ORDER BY MIN(colDate)) as prev_col1,
. . .
FROM myTable . . .;
If you do need earlier data, then do the full aggregation and then choose the 13 months.
Upvotes: 1