Reputation: 224
First, full disclosure. I attempted to do this strictly in MS Access with correlated subqueries, and had some help on this post 12 month moving average by person, date. I originally thought my data would be small enough to chug through, but it is awful. As an alternative, I'm going to try running this in R and then writing results to a new table in MS Access. I have data such that I have the following fields:
rep, cyc_date, amt
Following the linked example by Andrie for a rolling 5-year period (as opposed to the 5-year average) R: Calculating 5 year averages in panel data, I am trying to get rolling 12 month average for amt
field by rep
. Here is my code:
library(zoo)
library(plyr)
library(RODBC)
# Pull data from local MS Access database. The referenced sqlFetch is a query
# that pulls the data, ordered by `rep`, then `cyc_date`
channel <- odbcConnectAccess2007("C://MyDB.accdb")
data <- data.frame(sqlFetch(channel, "MyView"))
# Ensure coercion of `cyc_date` to date type
data$cyc_date <- as.Date(data$cyc_date)
# Function (take from post above)
rollmean12 <- function(x) {
rollmean(x, 12)
}
# Calculate rolling average by person
rollvec <- ddply(data, .(data$rep), rollmean12(data$amt))
Unfortunately, this doesn't work. I'm getting the following error:
Error in llply(.data = .data, .fun = .fun, ..., .progress = .progress, :
.fun is not a function.
I'm not sure why this is happening. Do I need to explicitly convert data
to a zoo
object? If so, not sure how to handle the extra dimensionality resulting from the person_id
field. Any help would be very much appreciated.
Upvotes: 1
Views: 4754
Reputation: 696
Try the tidyquant
library
x %>% tq_mutate(
# tq_mutate args
select = amt,
mutate_fun = rollapply,
col_rename = "rollmean12", ####
# rollapply args
width = 12,
align = "right",
FUN = mean,
# mean args
na.rm = TRUE
)
Upvotes: 0
Reputation: 224
I found this code on the following post: applying rolling mean by group in R
data$movavg <- ave(data$amt, data$rep, FUN = function(x) rollmean(x, k=12, align="right", na.pad=T)).
ave
saves the day!
Upvotes: 3
Reputation: 6213
Just some hints, as I don't work at all with time series: ddply
requires a data frame input, so don't convert it to a zoo
object. .(data$rep)
I think should be just .(rep)
, and rollmean12
should not be called with arguments. Rather, you should re-write the function to extract the columns you want. So, approximately something like this:
rollmean12 <- function(x) rollmean(x$amt, 12)
If you do ?ddply
there is a link to a very helpful publication in JSS.
Upvotes: 0