Snowflake

Reputation: 3081

Calculate the average per row per set of columns

Given a matrix, for example mat <- matrix(1:100, nrow = 4), and a set of column combinations, c_w <- combn(c(1,2,3,4), 2), I would like to calculate the row means for each combination. For the first combination that is rowMeans(mat[,c_w[,1]]), for the second rowMeans(mat[,c_w[,2]]), and so on. I can wrap this in a for loop and then bind the results together into a results matrix. However, performance is a problem, so if possible I would like to do this in a vectorized manner. So my question is:

Can we do this without for loops in the R code?

Edit: I would like to have the result in matrix form, where each column holds the means for one combination. However, this can also be achieved with some small additions to Arun's code. Please turn the comment into an answer so that I can give you points :).

Thanks

Upvotes: 2

Views: 245

Answers (1)

akrun

Reputation: 887741

We can use the FUN argument of combn to compute the rowMeans directly within the combn step, subsetting the columns of 'mat' with the column index that combn generates:

 combn(1:4, 2, FUN=function(x) rowMeans(mat[,x]))
 #    [,1] [,2] [,3] [,4] [,5] [,6]
 #[1,]    3    5    7    7    9   11
 #[2,]    4    6    8    8   10   12
 #[3,]    5    7    9    9   11   13
 #[4,]    6    8   10   10   12   14

Another option, if we already have the combn output, is to split 'c_w' by its column index (col(c_w)), loop through the resulting 'list' elements with sapply, subset 'mat' with each numeric index, and get the rowMeans:

 sapply(split(c_w, col(c_w)), function(x) rowMeans(mat[,x]))
 #     1 2  3  4  5  6
 #[1,] 3 5  7  7  9 11
 #[2,] 4 6  8  8 10 12
 #[3,] 5 7  9  9 11 13
 #[4,] 6 8 10 10 12 14

A third approach is to concatenate (c) the column indices from 'c_w', use them to select the columns of 'mat', and create an array with the specified dimensions. Here, 4 is the number of rows of 'mat', 2 is the 'm' specified in combn, and 6 is the ncol of 'c_w'. Then loop with apply, specifying the MARGIN as 3, and get the rowMeans of each slice:

 apply(array(mat[,c(c_w)], c(4,2,6)), 3, rowMeans)
 #    [,1] [,2] [,3] [,4] [,5] [,6]
 #[1,]    3    5    7    7    9   11
 #[2,]    4    6    8    8   10   12
 #[3,]    5    7    9    9   11   13
 #[4,]    6    8   10   10   12   14
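
The dimensions c(4,2,6) are hard-coded for this example. A minimal sketch, assuming the same 'mat' and 'c_w' as in the question, that derives them from the inputs instead (the names dims and res_array are made up here for illustration):

```r
mat <- matrix(1:100, nrow = 4)
c_w <- combn(1:4, 2)

# Derive the array dimensions instead of hard-coding c(4, 2, 6):
# rows of 'mat', size of each combination, number of combinations
dims <- c(nrow(mat), nrow(c_w), ncol(c_w))

# One nrow(mat) x m slice per combination, then the row means of each slice
res_array <- apply(array(mat[, c(c_w)], dims), 3, rowMeans)
```

This should keep working if the combination size or the number of rows changes, as long as every index in 'c_w' is a valid column of 'mat'.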

Or, as @A.Webb mentioned, apply would be more natural for a matrix like c_w:

 apply(c_w,2,function(i) rowMeans(mat[,i]))
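
All of the options above still evaluate a function once per combination. If that matters for performance, one possible alternative, which goes beyond the approaches above and is only a sketch, is to index whole blocks of columns at once and average them with matrix arithmetic. The names res_pairs, blocks and res_general are invented for this example:

```r
mat <- matrix(1:100, nrow = 4)
c_w <- combn(1:4, 2)

# Pairs only: add the two selected column blocks element-wise and halve
res_pairs <- (mat[, c_w[1, ]] + mat[, c_w[2, ]]) / 2

# Any combination size m: sum the m column blocks, then divide by m
# (this loops over the m rows of 'c_w', not over the combinations)
blocks <- lapply(seq_len(nrow(c_w)), function(k) mat[, c_w[k, ], drop = FALSE])
res_general <- Reduce(`+`, blocks) / nrow(c_w)

# Both should agree with the apply() version
res_apply <- apply(c_w, 2, function(i) rowMeans(mat[, i]))
stopifnot(isTRUE(all.equal(res_pairs, res_apply)),
          isTRUE(all.equal(res_general, res_apply)))
```

Whether this is actually faster depends on the matrix size and the number of combinations, so it is worth timing on the real data before relying on it.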

Upvotes: 2
