Reputation: 2452
Most packages and posts I found apply mean to a fixed size window or the aggregate month/week data. Is it possible to calculate rolling k month average?
For example, for 1 month rolling window, assuming the data is:
Date Value
2012-05-28 101
2012-05-25 99
2012-05-24 102
....
2012-04-30 78
2012-04-27 82
2012-04-26 77
2012-04-25 75
2012-04-24 76
The first three rolling 1 month windows should be:
1. 2012-05-28 to 2012-04-30
2. 2012-05-25 to 2012-04-26
3. 2012-05-24 to 2012-04-25
Please note that this is NOT the fixed width rolling window. The window actually changes on the daily basis.
Upvotes: 6
Views: 5351
Reputation: 2867
runner package fully supports rolling windows operations on irregulary spaced time series. To calculate 1-month moving average on x
object one have to specify idx = date
(to make a runner time dependent) and k = "1 months"
or k = 30
(days) depending what is more important to user. User can apply any R function - in this case we execute mean
.
# example data
x <- cumsum(rnorm(20))
date <- Sys.Date() + cumsum(sample(1:5, 20, replace = TRUE)) # unequaly spaced time series
# calculate rolling average
runner::runner(
x = x,
k = "1 months",
idx = date,
f = mean
)
Upvotes: 0
Reputation: 6710
Assuming your data frame is df
this works for me:
df$past_avg = sapply(df$Date, function(i){
i = as.POSIXct(i)
mean(subset(df, Date > (i - months(1)) & Date < i)$Value)
})
Uses just base R. You can adjust to however many months in the past you want by changing the value in months()
.
Upvotes: 0
Reputation: 821
I used this code to calculate monthly averages based on daily price data.
#function for extracting month is in the lubridate package
install.packages(c("plyr", "lubridate"))
require(plyr); require(lubridate)
#read the daily data
daily = read.csv("daily_lumber_prices.csv")
price = daily$Open
date = daily$Date
#convert date to a usable format
date = strptime(date, "%d-%b-%y")
mon = month(date)
T = length(price)
#need to know when months change
change_month = rep(0,T)
for(t in 2:T){
if(mon[t] != mon[t-1]){
change_month[t-1] = 1
}
}
month_avg = rep(0,T)
total = 0
days = 0
for(t in 1:T){
if(change_month[t] == 0){
#cumulative sums for each variable
total = total + price[t]
days = days + 1
}
else{
#need to include the current month in the calculation
month_avg[t] = (total + price[t]) / (days + 1)
#reset the variables
total = 0
days = 0
}
}
So, the variable month_avg is storing the monthly averages.
Is it something like this? This code accounts for the variable lengths of months. There's certainly a more efficient way to do it, but this works!
Upvotes: 1