Mean of each row on selected indices

Question

I have a numeric matrix with 7 columns, and I want to calculate the mean of selected entries in each row (only 2 numbers per row). A second matrix specifies which columns are selected for each row. What's the best way to do this in R?

           a       b      c       d      e       f       g
[1,]  0.0068  0.0240 0.0014  0.0035 0.0029  0.0293  0.0384
[2,]  0.0197  0.0325 0.0016  0.0163 0.0030  0.0234 -0.0937
[3,] -0.0194 -0.0265 0.0045 -0.0068 0.0029  0.0265  0.0997
[4,]  0.0048  0.0540 0.0015  0.0030 0.0031 -0.0090  0.0580
[5,]  0.0369  0.0112 0.0015  0.0072 0.0029  0.0597 -0.0134
[6,] -0.0025 -0.0325 0.0014  0.0031 0.0034  0.0757  0.0385


     [,1] [,2]
[1,]    2    1
[2,]    2    7
[3,]    2    6
[4,]    6    7
[5,]    7    2
[6,]    7    6

Rui Barradas · Accepted Answer

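You can index a matrix with another matrix: a two-column matrix of (row, column) pairs extracts one element per pair. Here, each column of your index matrix is paired with the row numbers to form one such index matrix, so your two columns give two index matrices, as in the function below.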
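
As a minimal sketch of that indexing rule, the toy matrix `m` and the index matrix `idx` below are made up for illustration and are not the data from the question:

```r
# Toy 3 x 4 matrix (hypothetical data), filled column-wise:
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7   10
# [2,]    2    5    8   11
# [3,]    3    6    9   12
m <- matrix(1:12, nrow = 3)

# Each row of idx is a (row, column) pair
idx <- cbind(1:3, c(2, 4, 1))

# Extracts m[1, 2], m[2, 4] and m[3, 1] in one step
picked <- m[idx]
picked
# 4 11 3
```

The result is a plain vector with one element per row of `idx`, which is why the selected values can be bound back together column-wise and passed to `rowMeans()`.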

First, some data, since yours is not posted in a convenient, copy-and-paste-able form.

set.seed(1234)
mat <- matrix(rnorm(6*7), ncol = 7)
inx <- matrix(sample(7, 2*6, TRUE), ncol = 2)

Now the problem.

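inxMeans <- function(X, I, na.rm = FALSE){
  # Pair each row number with the column index from I[, 1] and from I[, 2]
  inx1 <- cbind(seq_len(nrow(I)), I[, 1])
  inx2 <- cbind(seq_len(nrow(I)), I[, 2])
  # Extract both selected values per row and average them
  rowMeans(cbind(X[inx1], X[inx2]), na.rm = na.rm)
}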

inxMeans(mat, inx)
#[1] -0.9916598 -0.2410865 -0.4293729 -0.7624569 -0.2461655 -0.2812934

The function above can be generalized to an index matrix with any number k of columns by applying the same lookup to each column.

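inxMeans2 <- function(X, I, na.rm = FALSE){
  seq_nr <- seq_len(nrow(X))
  # For each column of I, extract one element per row of X
  res <- apply(I, 2, function(x) X[cbind(seq_nr, x)])
  # Average the k selected values in each row
  rowMeans(res, na.rm = na.rm)
}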

k <- 4
inx_k <- matrix(sample(7, k*6, TRUE), ncol = k)

inxMeans2(mat, inx_k)
#[1] -0.30121207  1.29338960 -0.05008767 -0.95088480  1.08333762
#[6] -0.26516481
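
For reference, here is a sketch of the same approach applied to the matrices from the question, typed in by hand. The names `dat` and `sel` are my own labels, not names from the question, and the function is repeated so the snippet runs on its own.

```r
# Same generalized function as above, repeated so this runs stand-alone
inxMeans2 <- function(X, I, na.rm = FALSE){
  seq_nr <- seq_len(nrow(X))
  res <- apply(I, 2, function(x) X[cbind(seq_nr, x)])
  rowMeans(res, na.rm = na.rm)
}

# Data matrix from the question, entered row by row
dat <- matrix(c(
   0.0068,  0.0240, 0.0014,  0.0035, 0.0029,  0.0293,  0.0384,
   0.0197,  0.0325, 0.0016,  0.0163, 0.0030,  0.0234, -0.0937,
  -0.0194, -0.0265, 0.0045, -0.0068, 0.0029,  0.0265,  0.0997,
   0.0048,  0.0540, 0.0015,  0.0030, 0.0031, -0.0090,  0.0580,
   0.0369,  0.0112, 0.0015,  0.0072, 0.0029,  0.0597, -0.0134,
  -0.0025, -0.0325, 0.0014,  0.0031, 0.0034,  0.0757,  0.0385
), nrow = 6, byrow = TRUE, dimnames = list(NULL, letters[1:7]))

# Column indices selected for each row, from the question
sel <- matrix(c(2, 1,
                2, 7,
                2, 6,
                6, 7,
                7, 2,
                7, 6), ncol = 2, byrow = TRUE)

res <- inxMeans2(dat, sel)
round(res, 4)
# 0.0154 -0.0306 0.0000 0.0245 -0.0011 0.0571
```

For example, row 1 selects columns 2 and 1, so its mean is (0.0240 + 0.0068) / 2 = 0.0154. This assumes `sel` has one row per row of `dat`. With a single-row input, `apply()` returns a vector rather than a matrix and `rowMeans()` would fail, so that case would need separate handling.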
