Sega Joey

Reputation: 33

Calculate mean down a column by an interval in R

I was unable to find a duplicate of my question, so I hope you can help.

Using a simple example, I want to calculate a rolling mean down a column, based on a specified window size (call it n).

data <- data.frame(x = rep(1:10,1), y = rep(11:20, 1))

I want to add a column z that holds the mean of y over the current row and the 3 rows before it (n = 4).

So the result should be:

structure(list(x = 1:10, y = 11:20, z = c("NA", "NA", "NA", "12.5", 
"13.5", "14.5", "15.5", "16.5", "17.5", "18.5")), class = "data.frame", .Names = c("x", 
"y", "z"), row.names = c(NA, -10L))

I calculated the expected averages by hand, over windows of n rows, as follows:

# For n = 4, row 4 is calculated as (11 + 12 + 13 + 14) / n
# For n = 4, row 5 is calculated as (12 + 13 + 14 + 15) / n
# And so on ...

I looked at posts such as the following:

  1. how to calculate combined column mean value in R
  2. Calculate mean by group
  3. How to calculate average of a variable by hour in R
  4. Calculate the mean of every 13 rows in data frame
  5. calculate a mean by criteria in R

I tried the code below, but it does not give the right result.

data<-data %>% mutate(z=rollapplyr(y,10,FUN=mean,by=4))

I appreciate your help. Thank you.

Upvotes: 1

Views: 2788

Answers (2)

jay.sf

Reputation: 72974

You could use outer() with a custom function; diag() then extracts the desired values.

myMean <- function(x, y) mean(dat[seq(x, y), 2])
mmean <- diag(outer(1:nrow(dat), (4:nrow(dat)), Vectorize(myMean)))

dat$z <- NA  # initialize column
dat$z[-(1:3)] <- mmean

#     x  y    z
# 1   1 11   NA
# 2   2 12   NA
# 3   3 13   NA
# 4   4 14 12.5
# 5   5 15 13.5
# 6   6 16 14.5
# 7   7 17 15.5
# 8   8 18 16.5
# 9   9 19 17.5
# 10 10 20 18.5

Data

dat <- data.frame(x=rep(1:10, 1), y=rep(11:20, 1))
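
Note that outer() calls myMean for every pair of start and end index, although only the diagonal is kept. As a lighter base-R alternative, here is a minimal sketch using embed(), which lays the windows out as matrix rows. The names n and win are my own, and it assumes y is numeric with at least n rows.

```r
dat <- data.frame(x = 1:10, y = 11:20)
n <- 4                          # window size (assumed name)

# embed() returns one row per complete window of n consecutive values
win <- embed(dat$y, n)

# pad the first n - 1 rows, which have no complete window, with NA
dat$z <- c(rep(NA_real_, n - 1), rowMeans(win))
dat$z
```

This computes each window mean once, so it should scale better than the outer() version on long columns.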

Upvotes: 0

Hunaidkhan

Reputation: 1418

You can do this with the rolling mean from the zoo package.

data <- data.frame(x = rep(1:10,1), y = rep(11:20, 1))

result <- structure(list(x = 1:10, y = 11:20,
                         z = c("NA", "NA", "NA", "12.5", "13.5", "14.5",
                               "15.5", "16.5", "17.5", "18.5")),
                    class = "data.frame", .Names = c("x", "y", "z"),
                    row.names = c(NA, -10L))

## Answer

library(zoo)
data$z <- rollmeanr(data$y,4,fill=NA)
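
For reference, the attempt in the question fails because the second argument of rollapplyr() is the window width (10 there), and by = 4 evaluates the function only at every 4th position rather than setting the window size. A sketch of the corrected call next to rollmeanr(), assuming zoo is installed (the column z2 is only there for comparison):

```r
library(zoo)

data <- data.frame(x = 1:10, y = 11:20)

# width = 4 is the window size; fill = NA pads the first 3 rows,
# which have no complete window behind them
data$z  <- rollmeanr(data$y, 4, fill = NA)
data$z2 <- rollapplyr(data$y, width = 4, FUN = mean, fill = NA)

# both columns should hold the same right-aligned rolling mean
all.equal(data$z, data$z2)
```

rollmeanr() is the shorter choice for a plain mean; rollapplyr() is useful when you want some other function over the same window.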

Upvotes: 2
