user2797174
user2797174

Reputation: 167

sum of preceding 12 months in R

I am trying to sum rows that are withing last 12 months period.

df<- read.table(header=T, text='yyyymm      amount
200809  261900
200810  149000
200908  120000
201104  81500
201107  30000
201112  75000
201306  56000
201310  98000
201311  40000
201402  95000
201408  28900
201505  33000
201511  65000
201601  78000
201610  50000
201701  50000
201711  81500')

For example, for row one(200809==Sept, 2008), I don't have any previous data that are within past 12 months, so the new variable I want to create (sum_amt) ends up in sum_amt=NA. For row two, I have one record that is within past 12 months of 200810 (i.e. between 200710 & 200809), so sum_amt=261999. Similarly, for row three, I have two records that are within 12 months of 200908, so sum_amt=410900 (261900 + 261900) and so on.

Final results would be:

yyyymm  amount  sum_amt
200809  261900  NA
200810  261900  261900
200908  120000  410900
201104  81500   NA
201107  30000   81500
201112  75000   111500
201306  56000   NA
201310  98000   56000
201311  40000   154000
201402  95000   194000
201408  28900   233000
201505  33000   28900
201511  65000   33000
201601  78000   98000
201610  50000   143000
201701  50000   128000
201711  81500   50000

Thank you in advanced!!

Upvotes: 1

Views: 568

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 270348

Define a function sumfun which accepts a zoo series with yearmon time index. yearmon class internally represents a year and month as year + 0 for Jan, year + 1/12 for Feb and so on so subtracting 1 from a yearmon object gives one year back. sumfun returns the sum of the amounts in the last year or NA if there are no points before the current point. Then read df into zoo series amount and perform a rolling sum over it using sumfun. Finally convert back to data frame.

library(zoo)

sumfun <- function(x) {
  y <- x[time(x) < end(x) & time(x) >= end(x) - 1]
  if (length(y) == 0) NA else sum(y)
}

amount <- read.zoo(df, FUN = function(x) as.yearmon(paste(x), "%Y%m"))
sum12 <- rollapplyr(amount, 13, sumfun, partial = TRUE, fill = NA, coredata = FALSE)
fortify.zoo(cbind(amount, sum12), name = names(df)[1])

giving:

     yyyymm amount  sum12
1  Sep 2008 261900     NA
2  Oct 2008 149000 261900
3  Aug 2009 120000 410900
4  Apr 2011  81500     NA
5  Jul 2011  30000  81500
6  Dec 2011  75000 111500
7  Jun 2013  56000     NA
8  Oct 2013  98000  56000
9  Nov 2013  40000 154000
10 Feb 2014  95000 194000
11 Aug 2014  28900 233000
12 May 2015  33000  28900
13 Nov 2015  65000  33000
14 Jan 2016  78000  98000
15 Oct 2016  50000 143000
16 Jan 2017  50000 128000
17 Nov 2017  81500  50000

Upvotes: 2

LocoGris
LocoGris

Reputation: 4480

This is my proposal:

df<- read.table(header=T, text='yyyymm      amount
200809  261900
                200810  149000
                200908  120000
                201104  81500
                201107  30000
                201112  75000
                201306  56000
                201310  98000
                201311  40000
                201402  95000
                201408  28900
                201505  33000
                201511  65000
                201601  78000
                201610  50000
                201701  50000
                201711  81500')

df <- as.data.table(df)
df[, yyyymmdd:=ymd(paste0(yyyymm,"01"))]
sum_year <- function(origin){
  sum(df[(origin-yyyymmdd)<=366 &(origin-yyyymmdd)>0]$amount)
}
df[, sum_amut:=mapply(sum_year, yyyymmdd)]

This will work under two assumptions:

  1. yyyymm is always in this format. If you add days, then you will have problems with leap years and not leap years.
  2. If two yyyymm are repeated, you do not consider them in the sum.

And, finally, it returns 0 instead of NA when no date last 12 months is found.

Best

Upvotes: 1

Related Questions