Alpha
Alpha

Reputation: 2452

How to calculate rolling average of past 1 month in R

Most packages and posts I found apply mean to a fixed size window or the aggregate month/week data. Is it possible to calculate rolling k month average?

For example, for 1 month rolling window, assuming the data is:

Date          Value
2012-05-28    101
2012-05-25     99
2012-05-24    102
....
2012-04-30     78
2012-04-27     82
2012-04-26     77
2012-04-25     75
2012-04-24     76

The first three rolling 1 month windows should be:

1. 2012-05-28 to 2012-04-30
2. 2012-05-25 to 2012-04-26
3. 2012-05-24 to 2012-04-25

Please note that this is NOT the fixed width rolling window. The window actually changes on the daily basis.

Upvotes: 6

Views: 5351

Answers (3)

GoGonzo
GoGonzo

Reputation: 2867

runner package fully supports rolling windows operations on irregulary spaced time series. To calculate 1-month moving average on x object one have to specify idx = date (to make a runner time dependent) and k = "1 months" or k = 30 (days) depending what is more important to user. User can apply any R function - in this case we execute mean.

# example data
x <- cumsum(rnorm(20))
date <- Sys.Date() + cumsum(sample(1:5, 20, replace = TRUE)) # unequaly spaced time series

# calculate rolling average
runner::runner(
  x = x, 
  k = "1 months", 
  idx = date, 
  f = mean
)

Upvotes: 0

CephBirk
CephBirk

Reputation: 6710

Assuming your data frame is df this works for me:

df$past_avg = sapply(df$Date, function(i){
    i = as.POSIXct(i)
    mean(subset(df, Date > (i - months(1)) & Date < i)$Value)
})

Uses just base R. You can adjust to however many months in the past you want by changing the value in months().

Upvotes: 0

wcampbell
wcampbell

Reputation: 821

I used this code to calculate monthly averages based on daily price data.

#function for extracting month is in the lubridate package
install.packages(c("plyr", "lubridate"))
require(plyr); require(lubridate)

#read the daily data
daily = read.csv("daily_lumber_prices.csv")
price = daily$Open
date = daily$Date

#convert date to a usable format
date = strptime(date, "%d-%b-%y")
mon = month(date)
T = length(price)

#need to know when months change
change_month = rep(0,T)

for(t in 2:T){
  if(mon[t] != mon[t-1]){
    change_month[t-1] = 1
  }
}

month_avg = rep(0,T)
total = 0
days = 0

for(t in 1:T){
  if(change_month[t] == 0){
    #cumulative sums for each variable
    total = total + price[t] 
    days = days + 1
  }

  else{
    #need to include the current month in the calculation
    month_avg[t] = (total + price[t]) / (days + 1)
    #reset the variables
    total = 0
    days = 0
  }
}

So, the variable month_avg is storing the monthly averages.

Is it something like this? This code accounts for the variable lengths of months. There's certainly a more efficient way to do it, but this works!

Upvotes: 1

Related Questions