Carlos Siestrup
Carlos Siestrup

Reputation: 1216

Get avareage of n previous months by month query

I'm making a a query that gets the count of 2 columns and sum of another column, grouping by month from a date column of the last 13 months. This is my query :

SELECT  TO_CHAR(colDate,'yyyy_MM') as month ,
        COUNT(DISTINCT col1) AS col1,
        COUNT(DISTINCT col2) as col2,
        SUM(col3) as col3 
FROM myTable
WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
                                     from myTable
                                     order by  1 desc
                                     limit 13)
GROUP BY 1

The problem is that for each month, I also need the average of the previous 3 months of the :

COUNT(DISTINCT col1) AS col1, COUNT(DISTINCT col2) as col2, SUM(col3) as col3

so my query need to be something like :

SELECT  TO_CHAR(colDate,'yyyy_MM') as month ,
            COUNT(DISTINCT col1) AS col1,
            COUNT(DISTINCT col2) as col2,
            SUM(col3) as col3,
            ... as PreviousMonthsAvgCol1,
            ... as PreviousMonthsAvgCol2,
            ... as PreviousMonthsAvgCol3
    FROM myTable
    WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
                                         from myTable
                                         order by  1 desc
                                         limit 13)
    GROUP BY 1

The months before the first still need to be counted in the average of the first month.

Upvotes: 0

Views: 44

Answers (2)

Joe Harris
Joe Harris

Reputation: 14045

Agree, with Gordon Lindoff's answer.

However, I recommend not using TO_CHAR() in the date range predicate. This will force Redshift to scan more data than necessary.

Try using colDate BETWEEN '2017-01-01' and '2018-01-31' or DATE_TRUNC() if you must round the dates to whole months.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271003

If you don't need data from before the 13 month period, just use lag():

SELECT . . .,
       LAG(COUNT(DISTINCT col1)) OVER (ORDER BY MIN(colDate)) as prev_col1,
   . . . 
FROM myTable . . .;

If you do need earlier data, then do the full aggregation and then choose the 13 months.

Upvotes: 1

Related Questions