Yaya

Reputation: 51

Dealing with missing values with the R filter() function

I'd like to handle missing values using the filter() function in R.

Specifically, I want to compute the centered moving average X_t = 1/(2*T+1) * sum(X_i, i = (t-T)...(t+T)), where (X_t) is an ordinary time series that contains missing values. filter() computes the sums over the windows [t-T, t+T], but it does not give the mean of the values excluding the NAs: any window that contains an NA comes out as NA.

Does anyone have an idea how to deal with that?

Upvotes: 5

Views: 4042

Answers (3)

G. Grothendieck

Reputation: 269596

Try this:

library(zoo)
x <- 1:10
x[6] <- NA
rollapply(x, 3, mean, na.rm = TRUE)
## [1] 2.0 3.0 4.0 4.5 6.0 7.5 8.0 9.0

There are a variety of other arguments that you may or may not need, depending on exactly what output you want. See ?rollapply.
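As a hedged illustration of two of those arguments, the sketch below uses fill and partial (both described in ?rollapply) to keep the result the same length as the input. The variable names padded and shrunk are illustrative only.

```r
library(zoo)

x <- 1:10
x[6] <- NA

# Pad both ends with NA so the result lines up with x.
padded <- rollapply(x, 3, mean, na.rm = TRUE, fill = NA)

# Alternatively, average over shortened windows at the two ends.
shrunk <- rollapply(x, 3, mean, na.rm = TRUE, partial = TRUE)
```

Which of the two is appropriate depends on whether averages over incomplete windows at the edges are acceptable for your series.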

REVISED: I have updated the answer for a more recent version of rollapply, which allows a simpler call.

Upvotes: 4

jogr

Reputation: 31

The sapply trick did not quite work for me. You have to pad the initial vector to get it to work with values of k larger than 1. Here is my code:

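k <- 1  ## Moving average over three points (2k + 1).
x <- c(rep(1, 5), NA, rep(1, 5))  # input vector
## Pad with k NAs on each side so every window stays inside the vector.
stmp <- c(rep(NA, k), x, rep(NA, k))
## Index into the padded vector; na.rm = TRUE drops the padding and the missing value.
smooth <- sapply((k + 1):(k + length(x)), function(i) mean(stmp[(i - k):(i + k)], na.rm = TRUE))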

I also added a function statement so the code runs without error. Hope it helps :)

Upvotes: 1

Ian Ross

Reputation: 987

If you want a simple moving average over 2k+1 points, you can do this:

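x <- c(rep(1, 5), NA, rep(1, 5))
k <- 1  ## Moving average over three points.
## sapply needs a function of the index i; na.rm = TRUE skips the NA in each window.
smooth <- sapply(seq_along(x), function(i) mean(x[(i - k):(i + k)], na.rm = TRUE))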

which results in a vector of all ones in this case.
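The same NA-aware average can also be sketched with filter() itself, which is what the question asked about. The idea, offered as one possible approach rather than a definitive one, is to filter a zero-filled copy of the series and an indicator of the non-missing values, then divide the two. The names ok, x0, win_sum and win_n are illustrative, and the example series is made up.

```r
x <- c(1, 2, 3, NA, 5, 6, 7)      # example series with one missing value
k <- 1                            # half-width; the window has 2k + 1 points
w <- rep(1, 2 * k + 1)            # unit weights, so filter() returns window sums

ok <- !is.na(x)                   # TRUE where a value is present
x0 <- ifelse(ok, x, 0)            # replace NA by 0 so it adds nothing to the sum

# stats::filter is named explicitly because other packages mask filter().
win_sum <- stats::filter(x0, w, sides = 2)              # sum of non-missing values
win_n   <- stats::filter(as.numeric(ok), w, sides = 2)  # count of non-missing values

smooth <- as.numeric(win_sum / win_n)
```

The first and last k entries are NA because filter() does not compute incomplete windows, and a window with no observed values gives NaN (0/0).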

Upvotes: 0
