Urvah Shabbir
Urvah Shabbir

Reputation: 985

R moving average function to deal with values less window size

Following this answer, I have used a moving average function for a window size of 2, 3 and 4.

require(zoo)
#MOVING AVERAGE FUNCTION
get.mav <- function(df, n = 2){
  if(length(df) < n){
    return(df)
  } 
  c(df[1:(n-1)],rollapply(df,width = n, mean, align="right"))
}

#DATA FRAME (dummy)
ID <- c("d","b","a","a","c","e","b","d","b","b")
Value <- c(4,5,5,3,2,1,6,9,5,5)
df <-data.frame(ID,Value)

# FUNCTION IMPLEMENTATION
df <- with(df,df[order(ID),])
df$mav2 <- unlist(aggregate(Value~ID,df,get.mav,na.action = NULL,n=2)$Value) 
df$mav3 <- unlist(aggregate(Value~ID,df,get.mav,na.action = NULL,n=3)$Value) 
df$mav4 <- unlist(aggregate(Value~ID,df,get.mav,na.action = NULL,n=4)$Value) 

#OUTPUT
ID  Value   mav2    mav3    mav4
a   5   5   5   5
a   3   4   3   3
b   5   5   5   5
b   6   5.5 6   6
b   5   5.5 5.3 5
b   5   5   5.3 5.25
c   2   2   2   2
d   4   4   4   4
d   9   6.5 9   9
e   1   1   1   1

The function get.mav works exactly the way it should. I want to change this function such that

For window size 3, if df length is 2, it takes the mean of those two elements rather than simply returning df.

Similarly for window size 4, if length is 3 or 2, it takes mean of those three or two elements rather simply returning df.

I tried the if statements but comparisons are not working correctly. Any help would be appreciated.

Thanks.

Upvotes: 2

Views: 604

Answers (1)

G. Grothendieck
G. Grothendieck

Reputation: 269491

For each width use ave to to invoke rollapplyr by ID. partial = TRUE in rollapplyr causes it to average partial number of points near the beginning.

library(zoo)

roll <- function(x, group, w) {
   ave(x, group, FUN = function(x) rollapplyr(x, w, mean, partial = TRUE))
}

transform(df[order(df$ID), ], 
   mav2 = roll(Value, ID, 2), 
   mav3 = roll(Value, ID, 3), 
   mav4 = roll(Value, ID, 4)
)

or alternatively:

w <- 2:4
names(w) <- paste0("mav", w)

with(df[order(df$ID), ],
   data.frame(ID, Value, lapply(w, roll, x = Value, group = ID), check.names = FALSE)
)

Either gives:

   ID Value mav2     mav3     mav4
1   a     5  5.0 5.000000 5.000000
2   a     3  4.0 4.000000 4.000000
3   b     5  5.0 5.000000 5.000000
4   b     6  5.5 5.500000 5.500000
5   b     5  5.5 5.333333 5.333333
6   b     5  5.0 5.333333 5.250000
7   c     2  2.0 2.000000 2.000000
8   d     4  4.0 4.000000 4.000000
9   d     9  6.5 6.500000 6.500000
10  e     1  1.0 1.000000 1.000000

Update: Fixed.

Upvotes: 2

Related Questions