Reputation: 2602
I have a data.frame with columns from different groups (here, a
and b
), and another data.frame containing the weights to perform a weighted mean:
test = data.frame(a.1=rep(2,5), b.1=rep(3,5), a.2=6:10, b.2=11:15)
tweights = data.frame(name=c('a.1', 'b.1', 'a.2', 'b.2'),
w=c(0.2, 0.33, 0.8, 0.67))
For each line in test
, I want to perform a weighted mean for the columns containing a
with the weights given by their corresponding value in tweights
, and the same for the columns with b
.
What I tried to do:
test %>% rowwise() %>%
mutate(awmean = weighted.mean(c(a.1, a.2),
tweights$w[grepl('a', tweights$name)]),
bwmean = weighted.mean(c(b.1, b.2),
tweights$w[grepl('b', tweights$name)]))
This is working fine but this is not efficient nor elegant, I woudl like to avoid explicitly mentioning the column names (a.1
, a.2
etc), and the second part invoking grepl
doesn't look very clean to me neither...
I tried something like this but it is wrong:
test %>% rowwise() %>%
mutate(awmean = weighted.mean(contains('a'),
tweights$w[grepl('a', tweights$name)]))
Error: error in evaluating the argument 'x' in selecting a method
for function 'weighted.mean': Error: could not find function "contains"
Note that I assume here that the order of the columns a.1 : a.n
and the order of the corresponding lines in tweights
is the same, which can be OK. A solution really taking care of the matching between the values and weights in weighted.mean
would be even better...
Upvotes: 2
Views: 841
Reputation: 9592
Maybe a custom function?
# get weighted means, for names beginning with a certain letter
getWM <- function(letter1) {
rgx <- paste0('^', letter1)
apply(test[, grep(rgx, names(test))], 1, weighted.mean,
w = subset(tweights, grepl(rgx, name))$w )
}
Now you can just make a call like:
getWM('a')
[1] 5.2 6.0 6.8 7.6 8.4
Or, for all letters:
first_letters <- unique(gsub('[^a-z]', '', names(test)))
sapply(first_letters, getWM)
a b
[1,] 5.2 8.36
[2,] 6.0 9.03
[3,] 6.8 9.70
[4,] 7.6 10.37
[5,] 8.4 11.04
Upvotes: 1