Manuel R

Reputation: 4145

Find array index of elements of a matrix that match a value in a vector of candidate values

I have been googling and searching Stack Overflow for a while, but I can't seem to find the right answer.

I have a matrix that contains different character strings like "a" or "gamma", or even numbers coerced to character.

How do I get the array indices of the elements of a matrix m that match any value in a vector of candidate values (note that these values could be any character string)? Here is what I tried. I thought which(m %in% ...) would do it, but it doesn't return what I expected.

m <- matrix(c(0, "a", "gamma", 0, 0.5, 0, 0, 0, 0), ncol = 3)
m
#>      [,1]    [,2]  [,3]
#> [1,] "0"     "0"   "0" 
#> [2,] "a"     "0.5" "0" 
#> [3,] "gamma" "0"   "0"

which(m == "a", arr.ind = TRUE) # as expected
#>      row col
#> [1,]   2   1

which(m == "a" | m == "gamma", arr.ind = TRUE) # also obvious
#>      row col
#> [1,]   2   1
#> [2,]   3   1

candidates <- c("a", "gamma", "b")
which(m %in% candidates, arr.ind = TRUE) # not what I expected
#> [1] 2 3

Created on 2019-09-11 by the reprex package (v0.3.0)

Any help?

Upvotes: 12

Views: 3327

Answers (3)

moodymudskipper

Reputation: 47350

The following function is a variant of %in% that behaves more consistently with ==. This includes its behavior with matrices, but also with other classes, and it makes sure NAs stay NAs.

`%in{}%` <- function(x, table) {
  table <- unlist(table)
  if (is.list(x) && !is.data.frame(x)) {
    x <- switch(
      typeof(table),
      logical = as.logical(x),
      integer = as.integer(x),
      double = as.double(x),
      complex = as.complex(x),
      character = as.character(x),
      raw = as.raw(x))
  }

  # convert to character
  if (is.factor(table)) {
    table <- levels(table)[table]
  }
  if (is.data.frame(x)) {
    res <- sapply(x, `%in%`, table)
  } else if (is.matrix(x)) {
    res <- apply(x, 2, `%in%`, table)
  } else {
    res <- x %in% table
  }
  res[is.na(x)] <- NA
  res
}

m <- matrix(c(0, "a", "gamma", 0, 0.5, 0, 0, 0, 0), ncol = 3)
candidates <- c("a", "gamma", "b")
which(m %in{}% candidates, arr.ind = TRUE)
#>      row col
#> [1,]   2   1
#> [2,]   3   1
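
To illustrate the NA point with base R only (a minimal sketch; the vector x is made up for the illustration and is not part of the question's data): == propagates NA, while plain %in% turns it into FALSE (or TRUE if NA is in the table).

```r
# Hypothetical example vector containing a missing value
x <- c("a", NA, "z")

eq_result <- x == "a"    # `==` keeps the NA: TRUE NA FALSE
in_result <- x %in% "a"  # `%in%` never returns NA: TRUE FALSE FALSE

print(eq_result)
print(in_result)
```

Resetting the NA positions afterwards, as `res[is.na(x)] <- NA` does in the function above, is what brings the result back in line with ==.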

Upvotes: 2

Shree

Reputation: 11150

MrFlick's solution is probably one of the best you'll get, but if you must stick with built-in base R functions, here's one way:

which(matrix(m %in% candidates, dim(m)), arr.ind = TRUE)
#>      row col
#> [1,]   2   1
#> [2,]   3   1

Another way uses lapply and Reduce, though the approach above should be faster:

which(Reduce("|", lapply(candidates, function(x) m == x)), arr.ind = TRUE)
#>      row col
#> [1,]   2   1
#> [2,]   3   1
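
A third base R option, sketched here as an untimed alternative, is to keep the linear positions that which() returns for the %in% result and convert them with arrayInd(). The names linear and idx are just illustrative.

```r
m <- matrix(c(0, "a", "gamma", 0, 0.5, 0, 0, 0, 0), ncol = 3)
candidates <- c("a", "gamma", "b")

# Positions counted down the columns (column-major order)
linear <- which(m %in% candidates)

# Convert the linear positions into (row, col) pairs using the matrix shape
idx <- arrayInd(linear, dim(m))
print(idx)
```

By default the result has no row/col column names; passing useNames = TRUE to arrayInd() should add them.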

Upvotes: 4

MrFlick

Reputation: 206546

The problem is that %in% doesn't preserve the dimensions of the input. You can write your own function that does. For example:

`%matin%` <- function(x, table) {
  stopifnot(is.array(x))
  r <- x %in% table
  dim(r) <- dim(x)
  r
}

candidates <- c("a", "gamma", "b")
which(m %matin% candidates, arr.ind = TRUE)
#      row col
# [1,]   2   1
# [2,]   3   1
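
To see the dimension loss directly, compare the shape of the two results on the question's matrix (a quick check; hits_eq and hits_in are just illustrative names):

```r
m <- matrix(c(0, "a", "gamma", 0, 0.5, 0, 0, 0, 0), ncol = 3)
candidates <- c("a", "gamma", "b")

hits_eq <- m == "a"           # `==` returns a logical matrix
hits_in <- m %in% candidates  # `%in%` returns a plain logical vector

print(dim(hits_eq))  # the 3 x 3 shape is kept
print(dim(hits_in))  # NULL: no dimensions left
```

Without a dim attribute, which() has nothing to compute array indices from, so arr.ind = TRUE has no effect and you get the linear positions 2 and 3.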

Upvotes: 5
