Reputation: 418
I'm looking for a fast way to return the indices of the columns of a matrix that match values provided in a vector (ideally of length 1 or of the same length as the number of rows in the matrix). For instance:
mat <- matrix(1:100,10)
values <- c(11,2,23,12,35,6,97,3,9,10)
The desired function, which I call rowMatches(), would return:
rowMatches(mat, values)
[1] 2 1 3 NA 4 1 10 NA 1 1
Indeed, value 11 is first found at the 2nd column of the first row, value 2 appears at the 1st column of the 2nd row, value 23 is at the 3rd column of the 3rd row, value 12 is not in the 4th row... and so on.
Since I haven't found any solution in package matrixStats, I came up with this function:
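rowMatches <- function(mat, values) {
  res <- integer(nrow(mat))
  # values (length 1 or nrow(mat)) is recycled down the columns,
  # so each row is compared with its own value
  matches <- mat == values
  # go from the last column to the first so the lowest matching column wins
  for (col in ncol(mat):1) {
    res[matches[, col]] <- col
  }
  res[res == 0] <- NA  # rows without any match
  res
}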
For my intended use, there will be millions of rows and few columns. So splitting the matrix into rows (in a list called, say, rows) and calling Map(match, as.list(values), rows) would be way too slow.
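For reference, the row-splitting approach described above can be sketched as follows. It is only meant as a slow but easy-to-verify baseline for checking faster versions; the use of split() with row() to build the rows list and the name baseline are my assumptions, not something fixed by the question.

```r
# Example data from the question
mat <- matrix(1:100, 10)
values <- c(11, 2, 23, 12, 35, 6, 97, 3, 9, 10)

# Split the matrix into a list of row vectors (one list element per row)
rows <- split(mat, row(mat))

# Look up each value in its own row; match() gives the first position or NA
baseline <- unlist(Map(match, as.list(values), rows), use.names = FALSE)
baseline
```

On the example data this should reproduce the desired output shown earlier, which makes it handy for testing any faster candidate on a small matrix.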
But I'm not satisfied with my function because it contains a loop, which may be slow if there are many columns. It should be possible to use apply() on columns, but that won't make it faster.
Any ideas?
Upvotes: 1
Views: 2498
Reputation: 418
Roland's answer is good, but I'll post an alternative solution:
res <- which(mat == values, arr.ind = TRUE)
res <- res[match(seq_len(nrow(mat)), res[, 1]), 2]
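To make the two steps easier to follow, here is the same idea unrolled on the example data from the question. The intermediate names hits, first_hit and cols are just illustrative; the logic is meant to be the same as in the two lines above.

```r
mat <- matrix(1:100, 10)
values <- c(11, 2, 23, 12, 35, 6, 97, 3, 9, 10)

# One (row, col) pair per matching cell, listed in column-major order,
# so for any given row the lowest matching column comes first
hits <- which(mat == values, arr.ind = TRUE)

# For each row number, locate its first entry in hits (NA if the row has no match)
first_hit <- match(seq_len(nrow(mat)), hits[, 1])

# Column of the first match in each row; NA rows stay NA
cols <- unname(hits[first_hit, 2])
cols
```

Because which() reports matches column by column, match() picking the first entry for a row is what guarantees the lowest column index is returned.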
Upvotes: 1
Reputation: 132706
res <- arrayInd(match(values, mat), .dim = dim(mat))
res[res[, 1] != seq_len(nrow(res)), 2] <- NA
#      [,1] [,2]
# [1,]    1    2
# [2,]    2    1
# [3,]    3    3
# [4,]    2   NA
# [5,]    5    4
# [6,]    6    1
# [7,]    7   10
# [8,]    3   NA
# [9,]    9    1
#[10,]   10    1
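One thing that may be worth checking before relying on this: match(values, mat) returns the first position of each value in the whole matrix, so the result appears to depend on a value not occurring earlier in a different row. The example matrix has unique entries, so it is unaffected. The small sketch below uses a made-up 2 x 2 matrix (mat2 and values2 are hypothetical, not from the question) to show the difference from a row-wise comparison.

```r
# Hypothetical example in which the value 2 occurs in both rows
mat2 <- matrix(c(1, 2, 2, 1), 2)   # row 1 is (1, 2), row 2 is (2, 1)
values2 <- c(2, 2)

# arrayInd()/match() approach: match() only sees the first 2, which is in row 2
res <- arrayInd(match(values2, mat2), .dim = dim(mat2))
res[res[, 1] != seq_len(nrow(res)), 2] <- NA
global_first <- res[, 2]

# Row-wise comparison via which(), for contrast
hits <- which(mat2 == values2, arr.ind = TRUE)
row_wise <- unname(hits[match(seq_len(nrow(mat2)), hits[, 1]), 2])

global_first  # NA for row 1, although row 1 does contain a 2 in column 2
row_wise      # column 2 for row 1, column 1 for row 2
```

If the real data can contain repeated values across rows, the row-wise comparison looks like the safer choice.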
Upvotes: 2