Saurabh

Reputation: 1626

Simple moving average (partial window) of a vector using data.table in R

I have a simple vector as follows:

x = c(14.24, 13.82, 12.75, 12.92, 12.94, 13.00, 14.14, 16.28, 20.64, 17.64)

I am trying to compute the rolling mean of this vector using the following function:

library(data.table)
y = frollmean(x, 5)

I get the following result:

 [1]     NA     NA     NA     NA 13.334 13.086 13.150 13.856 15.400 16.340

However, I want this result instead:

 [1]     14.24 14.03 13.60 13.43 13.334 13.086 13.150 13.856 15.400 16.340

  1. The first value should be the same as in the original vector
  2. The second value should be the mean of the first and second value
  3. The third value should be the mean of the initial three values in the vector
  4. The fourth value should be the mean of the initial four values in the vector

The rest of the calculation is already handled correctly by frollmean.
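
As a quick check of the arithmetic in the list above, the four leading values are just cumulative means of the vector. A minimal base R sketch (the names k and leading are only illustrative):

```r
x <- c(14.24, 13.82, 12.75, 12.92, 12.94, 13.00, 14.14, 16.28, 20.64, 17.64)

# Cumulative mean of the first k values, for k = 1..4
k <- 1:4
leading <- cumsum(x)[k] / k
leading
```

This only covers the first four positions; from the fifth position onwards the full window of 5 applies.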

Any suggestions will be helpful.

Thanks!

Upvotes: 2

Views: 1025

Answers (3)

Andre Wildberg

Reputation: 19211

Put this in your ~/.Rprofile if you want to include the functionality in your base R arsenal:

rollmean <- function(vec, len, prtl = FALSE) {
  # The window cannot be longer than the vector itself
  if (len > length(vec)) {
    stop("Choose lower range, ", len, " > ", length(vec))
  }
  sapply(seq_along(vec), function(i) {
    if (i >= len) {
      # Full window: the last `len` values ending at position i
      mean(vec[(i - len + 1):i])
    } else if (isTRUE(prtl)) {
      # Partial window: all values seen so far
      mean(vec[1:i])
    } else {
      # Not enough values for a full window
      NA
    }
  })
}

x <- c(14.24, 13.82, 12.75, 12.92, 12.94, 13.00, 14.14, 16.28, 20.64, 17.64)

rollmean( x, 5 )
#[1]     NA     NA     NA     NA 13.334 13.086 13.150 13.856 15.400 16.340

rollmean( x, 5, TRUE )
#[1] 14.24000 14.03000 13.60333 13.43250 13.33400 13.08600 13.15000 13.85600
#[9] 15.40000 16.34000

Upvotes: 1

jangorecki

Reputation: 16727

Your frollmean call asks for a window of width 5. The leading elements are NA because at those positions there are not yet enough values to fill a window of that width. Partial windows are covered in the examples of the ?frollmean manual; the code below is adapted from those examples to your case.

x = c(14.24, 13.82, 12.75, 12.92, 12.94, 13.00, 14.14, 16.28, 20.64, 17.64)
library(data.table)
an = function(n, len) c(seq.int(n), rep(n, len-n))
y = frollmean(x, an(5, length(x)), adaptive=TRUE)
y
# [1] 14.24000 14.03000 13.60333 13.43250 13.33400 13.08600 13.15000 13.85600 15.40000 16.34000

The adaptive argument is a much more general way to customize the window width than a simple logical partial flag. Here the helper function an computes the window width for each position. If you call an on its own, you can see that the window widths are exactly what you expected, rather than a constant 5:

an(5, length(x))
# [1] 1 2 3 4 5 5 5 5 5 5
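
To make the meaning of these widths concrete, here is a base R sketch of the same calculation done by hand. It assumes that element i of the result is the mean of the w[i] values ending at position i, which is consistent with the output shown above. The helper adaptive_mean is hypothetical and not part of data.table.

```r
x <- c(14.24, 13.82, 12.75, 12.92, 12.94, 13.00, 14.14, 16.28, 20.64, 17.64)

# Same helper as above: widths 1, 2, ..., n, then n repeated
an <- function(n, len) c(seq.int(n), rep(n, len - n))

# Hypothetical helper (not part of data.table):
# mean of the w[i] values ending at position i
adaptive_mean <- function(v, w) {
  vapply(seq_along(v), function(i) mean(v[(i - w[i] + 1):i]), numeric(1))
}

manual <- adaptive_mean(x, an(5, length(x)))
manual
```

This loop is only meant to illustrate the window logic; for real data the frollmean call above is the one to use.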

Upvotes: 3

Ronak Shah

Reputation: 389275

You can use zoo's rollapplyr function with partial = TRUE.

zoo::rollapplyr(x, 5, mean, partial = TRUE)
#[1] 14.24 14.03 13.60 13.43 13.33 13.09 13.15 13.86 15.40 16.34

Upvotes: 2
