Reputation: 617
I provide two vectors here:
vec1 <- c(5, 2, 2, 2, 2, 3, 2, 3, 9, 6, 2, 2, 2, 3)
vec2 <- c(1.96845698, 1.11342534, 0.82580110, 0.35762122, 0.07210485, 0.06046759, 0.93615974, 0.85691566, 0.39439991,
0.26110080, 1.22082336, 0.71940824, 0.32571803, 0.46358160, 0.16009616, 0.13348428, 1.16801097, 0.30184661,
0.51190796, 1.69680701, 0.54418158, 0.74969466, 0.17246107, 0.66953561, 1.02689205, 1.67408220, 1.20311478,
0.74049935, 0.55211334, 0.31037724, 0.23620425, 0.34532764, 1.64696898, 0.23094382, 0.67733098, 0.32226374,
0.25774802, 0.35768477, 0.27219803, 0.02042260, 0.53784081, 1.27521977, 0.07043151, 0.11879638, 0.13358880)
Now I would like to calculate the mean of consecutive parts of vec2. The length of each part is given by the corresponding value in vec1, so the output should be a vector of the same length as vec1.
The first value of the output vector should be the mean of vec2[1:5], since vec1[1] = 5. The second value should be the mean of vec2[6:7], since vec1[2] = 2, and so forth, until the last value of the output vector, which should be the mean of vec2[43:45], since the last value of vec1 is 3.
I hope it is clear what I mean.
Here is the expected output vector, which I calculated manually:
vec3 <- c(0.8674819, 0.4983137, 0.6256578, 0.7409621, 0.5225631, 0.2523873, 0.7349288,
0.9176322, 0.7887523, 0.5765066, 0.3077164, 0.1463103, 0.9065303, 0.1076056)
Does anybody have an idea how to do that?
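To make the requirement concrete, here is a minimal baseline sketch with an explicit loop. The names lens, x, out and offset and the toy values are hypothetical stand-ins for vec1 and vec2, not the data from the question. It assumes that the chunk lengths add up to the length of the data vector, which also holds for the question's data (sum(vec1) is 45, the length of vec2).

```r
# Toy data (hypothetical, not the vectors from the question)
lens <- c(2, 3, 1)            # chunk lengths, plays the role of vec1
x <- c(1, 3, 2, 4, 6, 10)     # data vector, plays the role of vec2

# The chunk lengths must cover the data vector exactly
stopifnot(sum(lens) == length(x))

out <- numeric(length(lens))
offset <- 0                   # number of elements consumed so far
for (i in seq_along(lens)) {
  idx <- offset + seq_len(lens[i])   # indices of the i-th chunk
  out[i] <- mean(x[idx])
  offset <- offset + lens[i]
}
out
# [1]  2  4 10
```

Any vectorised solution should reproduce this result, so the loop can serve as a reference when comparing approaches.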
Upvotes: 2
Views: 361
Reputation: 886948
Using tidyverse
library(dplyr)
library(tidyr)
tibble(vec1) %>%
mutate(grp = row_number()) %>%
uncount(vec1) %>%
mutate(vec2 = vec2) %>%
group_by(grp) %>%
summarise(vec2 = mean(vec2))
# A tibble: 14 × 2
     grp  vec2
   <int> <dbl>
 1     1 0.867
 2     2 0.498
 3     3 0.626
 4     4 0.741
 5     5 0.523
 6     6 0.252
 7     7 0.735
 8     8 0.918
 9     9 0.789
10    10 0.577
11    11 0.308
12    12 0.146
13    13 0.907
14    14 0.108
Upvotes: 0
Reputation: 39858
Yet another option could be:
tapply(vec2, cumsum(sequence(vec1) == 1), mean)
        1         2         3         4         5         6         7         8         9
0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523
       10        11        12        13        14
0.5765066 0.3077164 0.1463103 0.9065303 0.1076056
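It may help to see how the grouping index in this one-liner is built. The sketch below breaks it into steps on hypothetical toy data (lens and x are illustrative names, not the question's vectors). It assumes every chunk length is at least 1; a zero length would contribute no positions and that chunk would silently drop out of the result.

```r
# Toy data (hypothetical, not the vectors from the question)
lens <- c(2, 3, 1)
x <- c(1, 3, 2, 4, 6, 10)

pos <- sequence(lens)       # position within each chunk: 1 2 1 2 3 1
is_start <- pos == 1        # TRUE wherever a new chunk begins
grp <- cumsum(is_start)     # running chunk number: 1 1 2 2 2 3

res <- tapply(x, grp, mean) # one mean per chunk, named by chunk number
res
```

Each TRUE in is_start increments the running count, so all elements of one chunk share the same group number.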
Upvotes: 0
Reputation: 968
Another solution using purrr
# first construct the ranges which are used as input in the purrr call
range2 <- cumsum(vec1)
# each range starts one past the end of the previous one; head() also works when vec1 has length 1
range1 <- c(1, head(range2, -1) + 1)
purrr::map2_dbl(range1, range2, function(x,y) mean(vec2[x:y]))
[1] 0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056
Upvotes: 0
Reputation: 39647
You can try:
tapply(vec2, rep(seq_along(vec1), vec1), mean)
#tapply(vec2, unlist(Map(rep, seq_along(vec1), each=vec1)), mean) #Alternative
#tapply(vec2, inverse.rle(list(lengths=vec1, values=seq_along(vec1))), mean) #Alternative
#        1         2         3         4         5         6         7         8
#0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322
#        9        10        11        12        13        14
#0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056
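As a small illustration of the grouping vector used here, the sketch below runs the same idea on hypothetical toy data (lens and x are illustrative names). Since tapply returns a named one-dimensional array, the sketch also shows one way to get a plain numeric vector like the expected output in the question, using as.vector.

```r
# Toy data (hypothetical, not the vectors from the question)
lens <- c(2, 3, 1)
x <- c(1, 3, 2, 4, 6, 10)

# Repeat each chunk number as many times as the chunk is long
grp <- rep(seq_along(lens), lens)   # 1 1 2 2 2 3

# tapply gives a named 1-d array; as.vector drops the names and dim
res <- as.vector(tapply(x, grp, mean))
res
# [1]  2  4 10
```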
Upvotes: 3
Reputation: 7941
The aggregate function can be used to do this if you first expand vec1 into a grouping vector:
grp <- rep(seq_along(vec1), vec1)
aggregate(vec2, list(grp), mean)$x
# [1] 0.8674819 0.4983137 0.6256578 0.7409621 0.5225631 0.2523873 0.7349288 0.9176322 0.7887523 0.5765066 0.3077164 0.1463103 0.9065303 0.1076056
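To show why the result is extracted with $x, here is a short sketch on hypothetical toy data (lens, x_toy and grp_toy are illustrative names). When called on a plain vector with an unnamed grouping list, aggregate returns a data frame whose grouping column is named Group.1 and whose value column is named x, whatever the input variable was called.

```r
# Toy data (hypothetical, not the vectors from the question)
lens <- c(2, 3, 1)
x_toy <- c(1, 3, 2, 4, 6, 10)
grp_toy <- rep(seq_along(lens), lens)

res <- aggregate(x_toy, list(grp_toy), mean)
names(res)   # "Group.1" "x"
res$x        # the per-chunk means as a plain numeric vector
```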
Upvotes: 2