Apply a rolling mean to a data frame by an index

Question

I would like to calculate a rolling mean on data in a single data frame by multiple ids. See my example dataset below.

date <- as.Date(c("2015-02-01", "2015-02-02", "2015-02-03", "2015-02-04", 
          "2015-02-05", "2015-02-06", "2015-02-07", "2015-02-08",  
          "2015-02-09", "2015-02-10", "2015-02-01", "2015-02-02", 
          "2015-02-03", "2015-02-04", "2015-02-05", "2015-02-06", 
          "2015-02-07", "2015-02-08", "2015-02-09", "2015-02-10"))
index <- c("a","a","a","a","a","a","a","a","a","a",
           "b","b","b","b","b","b","b","b","b","b")
x <- runif(20,1,100)
y <- runif(20,50,150)
z <- runif(20,100,200)

df <- data.frame(date, index, x, y, z)

I would like to calculate the rolling mean for x, y and z, by a and then by b.

I tried the following, but I am getting an error.

test <- tapply(df, df$index, FUN = rollmean(df, 5, fill=NA))

The error:

Error in xu[k:n] - xu[c(1, seq_len(n - k))] : 
  non-numeric argument to binary operator

It seems like there is an issue with the fact that index is a character, but I need it in order to calculate the means...

Dave Gruenewald · Accepted Answer

This should do the trick, using the dplyr and zoo packages:

library(dplyr)
library(zoo)

df %>% 
  group_by(index) %>% 
  mutate(x_mean = rollmean(x, 5, fill = NA),
         y_mean = rollmean(y, 5, fill = NA),
         z_mean = rollmean(z, 5, fill = NA))

You could probably tidy this up using mutate_each (deprecated in current dplyr in favor of across()) or some other form of mutate.
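As a rough sketch of that tidier form, here is the same grouped rolling mean written once for all three columns with across(). This assumes dplyr 1.0.0 or later; the set.seed call, the rebuilt df, and the result name are only there to make the example self-contained and are not part of the original question.

```r
library(dplyr)
library(zoo)

# Reproducible stand-in for the question's data frame (random values, fixed seed).
set.seed(1)
df <- data.frame(
  date  = rep(seq(as.Date("2015-02-01"), by = "day", length.out = 10), times = 2),
  index = rep(c("a", "b"), each = 10),
  x = runif(20, 1, 100),
  y = runif(20, 50, 150),
  z = runif(20, 100, 200)
)

result <- df %>%
  group_by(index) %>%
  # Apply the same centered 5-point rolling mean to x, y and z within each group;
  # .names appends "_mean" to each source column name.
  mutate(across(c(x, y, z),
                ~ rollmean(.x, k = 5, fill = NA),
                .names = "{.col}_mean")) %>%
  ungroup()
```

Because the window is centered and has width 5, the first two and last two rows of each group come back as NA.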

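You can also change the arguments to rollmean to fit your needs, such as align = "right" for a trailing window. Note that na.pad = TRUE is the older, deprecated way of writing fill = NA, so you do not need both.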
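To see what the align argument changes, here is a small sketch on a made-up vector (v is purely illustrative, not from the question). With the default centered window the NAs are split between both ends; with align = "right" each value is the mean of the current and the four preceding observations, so all the NAs land at the start.

```r
library(zoo)

v <- as.numeric(1:10)  # illustrative data only

# Default: centered window of width 5
centered <- rollmean(v, k = 5, fill = NA)
# NA NA 3 4 5 6 7 8 NA NA

# Trailing window: mean of the current value and the 4 before it
trailing <- rollmean(v, k = 5, fill = NA, align = "right")
# NA NA NA NA 3 4 5 6 7 8
```

Inside the grouped mutate above, the trailing version is usually what you want if the rolling mean at a given date should only use data up to that date.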
