climsaver

Reputation: 617

Use values of one vector as index range for another vector

I provide two vectors here:

vec1 <- c(5, 2, 2, 2, 2, 3, 2, 3, 9, 6, 2, 2, 2, 3)

vec2 <- c(1.96845698, 1.11342534, 0.82580110, 0.35762122, 0.07210485, 0.06046759, 0.93615974, 0.85691566, 0.39439991,
          0.26110080, 1.22082336, 0.71940824, 0.32571803, 0.46358160, 0.16009616, 0.13348428, 1.16801097, 0.30184661, 
          0.51190796, 1.69680701, 0.54418158, 0.74969466, 0.17246107, 0.66953561, 1.02689205, 1.67408220, 1.20311478, 
          0.74049935, 0.55211334, 0.31037724, 0.23620425, 0.34532764, 1.64696898, 0.23094382, 0.67733098, 0.32226374, 
          0.25774802, 0.35768477, 0.27219803, 0.02042260, 0.53784081, 1.27521977, 0.07043151, 0.11879638, 0.13358880)

Now I would like to calculate the mean of consecutive parts of vec2. The length of each part is given by the corresponding value of vec1.

So the output is supposed to be a vector of the same length as vec1.

The first value of this output vector should be the mean of vec2[1:5], since vec1[1] = 5. The second value should then be the mean of vec2[6:7], since vec1[2] = 2, and so forth, until the last value of the output vector, which should correspond to the mean of vec2[43:45], since the last value of vec1 is 3.

I hope it is clear what I mean.
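
To make the indexing concrete, here is a minimal sketch on a small toy example. The names `lens` and `x` are made up for illustration and stand in for vec1 and vec2. It assumes every length is at least 1 and that the lengths sum to the length of the data vector.

```r
# Toy stand-ins for vec1 (part lengths) and vec2 (data); hypothetical names
lens <- c(2, 3, 1)
x    <- c(1, 2, 3, 4, 5, 6)

# Last and first index of each part
ends   <- cumsum(lens)        # 2 5 6
starts <- ends - lens + 1     # 1 3 6

# Mean of x[starts[i]:ends[i]] for every part
part_means <- vapply(seq_along(lens),
                     function(i) mean(x[starts[i]:ends[i]]),
                     numeric(1))
print(part_means)             # 1.5 4.0 6.0
```

The result has one entry per element of `lens`, which is the shape the expected output should have.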

Here is the expected output vector, which I calculated manually:

vec3 <- c(0.8674819, 0.4983137, 0.6256578, 0.7409621, 0.5225631, 0.2523873, 0.7349288, 
          0.9176322, 0.7887523, 0.5765066, 0.3077164, 0.1463103, 0.9065303, 0.1076056)

Does anybody have an idea how to do that?

Upvotes: 2

Views: 361

Answers (5)

akrun

Reputation: 886948

Using tidyverse

library(dplyr)
library(tidyr)
tibble(vec1) %>%
    mutate(grp = row_number()) %>% 
    uncount(vec1) %>% 
    mutate(vec2 = vec2) %>% 
    group_by(grp) %>%
    summarise(vec2 = mean(vec2))
# A tibble: 14 × 2
     grp  vec2
   <int> <dbl>
 1     1 0.867
 2     2 0.498
 3     3 0.626
 4     4 0.741
 5     5 0.523
 6     6 0.252
 7     7 0.735
 8     8 0.918
 9     9 0.789
10    10 0.577
11    11 0.308
12    12 0.146
13    13 0.907
14    14 0.108

Upvotes: 0

tmfmnk

Reputation: 39858

Yet another option could be:

tapply(vec2, cumsum(sequence(vec1) == 1), mean)

        1         2         3         4         5         6         7         8         9 
0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523 
       10        11        12        13        14 
0.5765066 0.3077164 0.1463103 0.9065303 0.1076056 
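
For readers wondering why the grouping expression works, the following sketch prints its intermediate steps on a toy example (`lens` and `x` are hypothetical stand-ins for vec1 and vec2). It assumes all lengths are at least 1. A zero length would contribute nothing to `sequence()`, so the group labels would no longer line up with positions in vec1.

```r
# Toy stand-ins for vec1 and vec2; hypothetical names
lens <- c(2, 3, 1)
x    <- c(1, 2, 3, 4, 5, 6)

pos         <- sequence(lens)      # 1 2 1 2 3 1  (position within each part)
starts_here <- pos == 1            # TRUE FALSE TRUE FALSE FALSE TRUE
grp         <- cumsum(starts_here) # 1 1 2 2 2 3  (running part number)

# One mean per group: 1.5, 4 and 6
res <- tapply(x, grp, mean)
print(res)
```

Each `TRUE` marks the first element of a part, so the cumulative sum increases by one exactly where a new part begins.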

Upvotes: 0

Jagge

Reputation: 968

Another solution using purrr

# first construct the ranges that are used as input to the purrr call
range2 <- cumsum(vec1)         # last index of each part
range1 <- range2 - vec1 + 1    # first index of each part
purrr::map2_dbl(range1, range2, function(x, y) mean(vec2[x:y]))

 [1] 0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056

Upvotes: 0

GKi

Reputation: 39647

You can try:

tapply(vec2, rep(seq_along(vec1), vec1), mean)
#tapply(vec2, unlist(Map(rep, seq_along(vec1), each=vec1)), mean) #Alternative
#tapply(vec2, inverse.rle(list(lengths=vec1, values=seq_along(vec1))), mean) #Alternative
#        1         2         3         4         5         6         7         8 
#0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 
#        9        10        11        12        13        14 
#0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056 
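
Note that tapply() returns a named one-dimensional array rather than a plain vector. If a bare numeric vector like vec3 in the question is wanted, one option is to strip the attributes with as.vector(). A small sketch on toy data (`lens` and `x` are hypothetical stand-ins for vec1 and vec2):

```r
# Toy stand-ins for vec1 and vec2; hypothetical names
lens <- c(2, 3, 1)
x    <- c(1, 2, 3, 4, 5, 6)

grp <- rep(seq_along(lens), lens)       # 1 1 2 2 2 3
res <- as.vector(tapply(x, grp, mean))  # drop names and dim

print(res)                              # 1.5 4.0 6.0
```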

Upvotes: 3

Miff

Reputation: 7941

The aggregate function can be used to do this if you first expand vec1 into a grouping vector:

grp <- rep(seq_along(vec1), vec1)  # group index for each element of vec2
aggregate(vec2, list(grp), mean)$x
# [1] 0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056

Upvotes: 2
