Jeffrey Kramer
Jeffrey Kramer

Reputation: 1345

Ways to Calculate rolling subtotal

Say I have a table structured like this:

id     dt        val
a     1/1/2012   23
a     2/1/2012   24
a     6/1/2013   12
a     7/1/2013   56
b     1/1/2009   34
b     3/1/2009   78

Every id has a dt in the form of a month, and a value. There may be months missing, but there will never be duplicate months.

I need to calculate a 12-month rolling average for each data point. For example, the fourth row would be (56+12)/12. The third row would be (12)/12. The second row would be (24+23)/12, etc. I need to identify the month (and value) of the maximum moving average for a given ID.

Is this something I can even do in SQL itself, or do I need to export the dataset and use some other method? There are millions of rows, so I'd like to do it in SQL if I can. I've looked at a few of the MA methods and I'm not sure if they will work for what I'm trying to do.

The SQL I am using is a derivative used with Teradata. It supports most of the standard functions that I've needed to use.

Upvotes: 1

Views: 768

Answers (3)

user1509107
user1509107

Reputation:

I think you should take a look at the Windowing functions in Teradata. (Note: all latest ANSI SQL complaint databases support windowing functions to enable users process row-by-row operations instead of set-based ones).

So, using windowing functions I would write something like this:

SELECT ID ,DT ,VAL ,(SUM(VAL)OVER(PARTITION BY YEAR(DT)) )/12.00 AS L12M_mov_avg FROM some.table;

the code above is not tested - but, to just highlight the use of windowing functions.

Upvotes: 0

James Daily
James Daily

Reputation: 597

Assumptions:

  • Your date format is m/d/yyyy (I used format mm/dd/yyyy)
  • id on this table is an FK to some other entity where id is the PK
  • you are meant to take the date of the chosen row, and look for that row and all rows less than 12 months older for that id, and sum the val's in those rows

I'll write this in Oracle SQL because that's what I'm using and you didn't specify ;)

Query Summary:

  • "Chosen" is the instance of your table to serve as the input row
  • "Lookback" gathers all rows including your Chosen row and up to 12 months back minus 1 day
  • Sum up the lookback.val's for your answer
WITH DateTable
      AS (SELECT 'a' id, TO_DATE ('01/01/2012', 'mm/dd/yyyy') dt, 23 val FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('1/1/2012', 'mm/dd/yyyy'), 23 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('02/01/2012', 'mm/dd/yyyy'), 24 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('06/01/2013', 'mm/dd/yyyy'), 12 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('07/01/2013', 'mm/dd/yyyy'), 56 FROM DUAL
            UNION
            SELECT 'b', TO_DATE ('01/01/2009', 'mm/dd/yyyy'), 34 FROM DUAL
            UNION
            SELECT 'b', TO_DATE ('03/01/2009', 'mm/dd/yyyy'), 78 FROM DUAL)
SELECT chosen.id, chosen.dt, SUM (lookback.val)/12
  FROM DateTable chosen, DateTable lookback
 WHERE   chosen.id = 'a' --your input id
         AND chosen.dt = TO_DATE ('07/01/2013', 'mm/dd/yyyy') --your input date
         AND chosen.id = lookback.id
         AND lookback.dt > ADD_MONTHS (chosen.dt, -12)
         AND lookback.dt <= chosen.dt
GROUP BY chosen.id, chosen.dt;

And if you want to query on dates/months not present in any row, do this:

WITH DateTable
      AS (SELECT 'a' id, TO_DATE ('01/01/2012', 'mm/dd/yyyy') dt, 23 val FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('1/1/2012', 'mm/dd/yyyy'), 23 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('02/01/2012', 'mm/dd/yyyy'), 24 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('06/01/2013', 'mm/dd/yyyy'), 12 FROM DUAL
            UNION
            SELECT 'a', TO_DATE ('07/01/2013', 'mm/dd/yyyy'), 56 FROM DUAL
            UNION
            SELECT 'b', TO_DATE ('01/01/2009', 'mm/dd/yyyy'), 34 FROM DUAL
            UNION
            SELECT 'b', TO_DATE ('03/01/2009', 'mm/dd/yyyy'), 78 FROM DUAL),
     InputData
      AS (SELECT 'b' id, TO_DATE ('12/15/2009', 'mm/dd/yyyy') dt FROM DUAL)
SELECT InputData.id, InputData.dt, SUM (lookback.val)/12
  FROM DateTable lookback, InputData
 WHERE  lookback.id = InputData.id
    AND lookback.dt > ADD_MONTHS (InputData.DT, -12)
    AND lookback.dt <= InputData.DT
     GROUP BY InputData.id, InputData.dt;

Upvotes: 1

D Stanley
D Stanley

Reputation: 152616

Just use a subquery as the expression:

SELECT id, 
       dt, 
       val, 
       (
        SELECT SUM(val)/12 
        FROM mytable t2 
        WHERE t2.id = t.id
          AND t2.dt > DATEADD(mm, -12, t.dt) 
          AND t2.dt < t.dt
       ) val12MonthAvg 
FROM mytable t

However with millions or rows it's likely to be very slow.

Upvotes: 4

Related Questions