Björn Butter

Reputation: 169

Calculating the mean for intervals within a variable

Let's suppose I have a dataset like dat <- rnorm(25) and a vector that represents specific indices of my data: v <- c(1, 8, 13, 17, 25)

How can I calculate the mean for the following intervals: 1-1, 1-8, 8-13, 13-17, 17-25?

In general: I want to average specific intervals within dat depending on an index vector v, which is meaningful but also quite irregular.

Upvotes: 3

Views: 189

Answers (4)

akrun

Reputation: 887078

Using dplyr, you can group the rows by findInterval() applied to the row number and then summarise each group:

library(dplyr)
tibble(x = dat) %>% 
    group_by(Interval = findInterval(row_number(), v, left.open = TRUE)) %>% 
    summarise(x = mean(x))
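
To see which positions end up in which group, here is a minimal base R sketch of the grouping step alone. The names idx, grp and sizes are only for illustration and are not part of the answer above; idx stands in for row_number() on a 25-row table.

```r
# Index vector from the question
v <- c(1, 8, 13, 17, 25)

# Positions 1..25, standing in for row_number()
idx <- 1:25

# With left.open = TRUE each interval is open on the left and closed on
# the right, so position 1 gets group 0, positions 2..8 get group 1, etc.
grp <- findInterval(idx, v, left.open = TRUE)

# Number of positions per group: 1, 7, 5, 4, 8
sizes <- table(grp)
print(sizes)
```

Note that these groups do not share their endpoints: position 8 belongs only to group 1, not to group 2.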

Upvotes: 1

ThomasIsCoding

Reputation: 101317

You can use split() and cut() to create groups, then calculate the means in each group via sapply, i.e.,

r <- sapply(split(dat,cut(seq_along(dat), c(-Inf,v))),mean)

EXAMPLE

set.seed(1)
dat <- rnorm(25)
v <- c(1, 8, 13, 17, 25)
r <- sapply(split(dat,cut(seq_along(dat), c(-Inf,v))),mean)

giving

> r
  (-Inf,1]      (1,8]     (8,13]    (13,17]    (17,25] 
-0.6264538  0.2397270  0.3101554 -0.2877232  0.3456389 

Upvotes: 0

GKi

Reputation: 39657

You can use cut to get the interval groups and aggregate to calculate the mean per group.

aggregate(dat, list(interval = cut(seq_along(dat), c(0, v))), mean)
#  interval          x
#1    (0,1] -0.5604756
#2    (1,8]  0.3484638
#3   (8,13]  0.1704305
#4  (13,17]  0.4599013
#5  (17,25] -0.6754733

Or, if you want the intervals to overlap at their endpoints, so that each interval includes both its first and last position (1-8, 8-13, ...), you can use sapply.

sapply(seq_along(v), function(i) mean(dat[v[max(1, i - 1)]:v[i]]))
#[1] -0.56047565  0.23484641 -0.06881816  0.44807533 -0.54510397
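
The difference between the two variants is easier to check with deterministic data. The sketch below assumes dat <- 1:25 instead of random numbers (an illustrative choice, not the data used in this answer), so the means can be verified by hand; the names disjoint and overlapping are also just for illustration.

```r
# Deterministic stand-in for the data, so the means are easy to verify
dat <- 1:25
v <- c(1, 8, 13, 17, 25)

# Disjoint groups (0,1], (1,8], (8,13], (13,17], (17,25]
disjoint <- as.vector(tapply(dat, cut(seq_along(dat), c(0, v)), mean))

# Overlapping ranges 1:1, 1:8, 8:13, 13:17, 17:25 (endpoints are shared)
overlapping <- sapply(seq_along(v),
                      function(i) mean(dat[v[max(1, i - 1)]:v[i]]))

print(disjoint)     # 1.0  5.0 11.0 15.5 21.5
print(overlapping)  # 1.0  4.5 10.5 15.0 21.0
```

The overlapping version matches the intervals as written in the question (1-1, 1-8, 8-13, 13-17, 17-25), while the cut-based version counts every position exactly once.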

Upvotes: 4

Ronak Shah

Reputation: 388962

We can use findInterval to form groups and use tapply to get the mean for each group.

tapply(dat, findInterval(seq_along(dat), v, left.open = TRUE), mean)

#         0          1          2          3          4 
#-0.5604756  0.3484638  0.1704305  0.4599013 -0.6754733 
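
The group codes 0 to 4 can be replaced with readable range labels if that helps. This is only a sketch: the helper names grp, res and lower are made up here, and it assumes v is strictly increasing so that every group is non-empty.

```r
set.seed(123)
dat <- rnorm(25)
v <- c(1, 8, 13, 17, 25)

grp <- findInterval(seq_along(dat), v, left.open = TRUE)
res <- tapply(dat, grp, mean)

# First position of each group: v[1] for the first group,
# otherwise one past the previous break
lower <- c(v[1], head(v, -1) + 1)

# Label each mean with the positions it actually covers
names(res) <- paste(lower, v, sep = "-")
print(res)
```

The labels come out as 1-1, 2-8, 9-13, 14-17 and 18-25, which makes it explicit that the groups are disjoint.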

data

set.seed(123)
dat <- rnorm(25)
v <- c(1, 8, 13, 17, 25)

Upvotes: 4
