Reputation: 44320
I love being able to operate across matrix elements in R with operators like == and |:
(m <- matrix(1:4, nrow=2))
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4
m == 2 | m == 3
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
Unfortunately, %in% doesn't have this same nice behavior, and returns a vector instead of a matrix:
m %in% c(2, 3)
# [1] FALSE TRUE TRUE FALSE
Noting that %in% is defined as function(x, table) match(x, table, nomatch = 0L) > 0L, I figured I could redefine match to get my desired behavior:
match <- function(x, table, nomatch = NA_integer_, incomparables = NULL) {
  # Delegate to the base implementation, then restore the matrix shape
  m <- base::match(x, table, nomatch, incomparables)
  if (is.matrix(x)) matrix(m, nrow(x))
  else m
}
While this does work if I explicitly call match, I still don't get the desired result when running m %in% c(2, 3):
match(m, c(2, 3), nomatch=0L) > 0L
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
m %in% c(2, 3)
# [1] FALSE TRUE TRUE FALSE
Why isn't %in% now returning a matrix?
Upvotes: 4
Views: 127
Reputation: 44320
Thanks to @joran for pointing me to this excellent article, which clarified for me why %in% was not using my newly defined match function. Here's my understanding of what's going on:
The user-defined match function is stored in the global environment, while the original match function is still stored in namespace:base:
environment(match)
# <environment: R_GlobalEnv>
environment(base::match)
# <environment: namespace:base>
Now, consider what happens when I call m %in% c(2, 3):
1. R looks up the %in% function, which is just defined as function(x, table) match(x, table, nomatch = 0L) > 0L.
2. The %in% function needs to find the match function, so it first searches in its local environment that was created as part of the function call. match is not defined there.
3. The next place to search for match is the enclosing environment of the function. We can figure out what that is with:
environment(`%in%`)
# <environment: namespace:base>
4. Since the original match (not the user-defined version) is defined in namespace:base, this is the version of the function that is called.
To get my matrix version of %in% to work, the simplest approach is to follow the advice of @Molx and redefine %in% so it's stored in the global environment (note that there's still an identical version of the function in namespace:base):
`%in%` <- function(x, table) match(x, table, nomatch = 0L) > 0L
environment(`%in%`)
# <environment: R_GlobalEnv>
Now m %in% c(2, 3) will search for the match function first in its local function environment and then in the enclosing environment (R_GlobalEnv), finding our user-defined version of the match function:
m %in% c(2, 3)
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
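The lookup order described above can be reproduced with a toy example that leaves match and %in% untouched. This is only a minimal sketch: pkg_env, helper, and caller are hypothetical names, with pkg_env standing in for a package namespace.

```r
# Hypothetical stand-in for a package namespace.
pkg_env <- new.env()

# Define helper() and caller() inside pkg_env, so pkg_env becomes
# the enclosing environment of both functions.
local({
  helper <- function() "namespace version"
  caller <- function() helper()
}, envir = pkg_env)

# A same-named function in the current environment, analogous to
# the user-defined match().
helper <- function() "global version"

# The current environment; at top level this is R_GlobalEnv.
here <- environment()

caller <- pkg_env$caller
res_before <- caller()  # helper is resolved through pkg_env

# Re-pointing the enclosing environment changes which helper is found.
environment(caller) <- here
res_after <- caller()

print(c(before = res_before, after = res_after))
```

Before the environment is changed, caller() ignores the helper defined alongside it at top level, just as base's %in% ignores a user-defined match.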
Another way to get %in% to use the user-defined match function is to change its enclosing environment to the global environment. Note that the assignment below does not modify base::`%in%` itself; it creates a copy of the function in the global environment whose enclosing environment is R_GlobalEnv:
rm(`%in%`) # Remove user-defined %in%
environment(`%in%`) <- .GlobalEnv # Creates a global copy of %in%; undo with rm(`%in%`)
m %in% c(2, 3)
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
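One detail worth checking is where the result of environment(`%in%`) <- ... ends up. As a replacement-function call, it assigns its result in the calling environment, so the version in namespace:base should remain unchanged. A minimal sketch of that check, where here, has_local_copy, and base_env_name are names introduced only for this illustration:

```r
# The current environment; at top level this is R_GlobalEnv.
here <- environment()

# Start from a clean slate in case a user-defined %in% already exists.
if (exists("%in%", envir = here, inherits = FALSE)) {
  rm(list = "%in%", envir = here)
}

# Looks up %in% (found in base), changes the copy's enclosing
# environment, and assigns that copy in the current environment.
environment(`%in%`) <- here

has_local_copy <- exists("%in%", envir = here, inherits = FALSE)
base_env_name <- environmentName(environment(base::`%in%`))

print(has_local_copy)  # a copy now lives in the current environment
print(base_env_name)   # base's own %in% still encloses namespace:base

# Undo: remove the copy so that %in% resolves to base's version again.
rm(list = "%in%", envir = here)
```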
As mentioned by the commenters on @Molx's answer, the most sensible thing to do is to avoid all this headache by naming my new function something else like %inm%.
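A separately named operator might look like the sketch below. The name %inm% follows the suggestion above. Restoring dim and dimnames (rather than calling matrix()) is an assumption of this sketch, made so that row and column names and higher-dimensional arrays are preserved.

```r
# Shape-preserving variant of %in%; leaves base's %in% and match alone.
`%inm%` <- function(x, table) {
  res <- base::match(x, table, nomatch = 0L) > 0L
  if (!is.null(dim(x))) {
    dim(res) <- dim(x)            # restore the matrix/array shape
    dimnames(res) <- dimnames(x)  # keep row/column names, if any
  }
  res
}

m <- matrix(1:4, nrow = 2)
m %inm% c(2, 3)
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE

# Plain vectors behave exactly as with %in%.
1:4 %inm% c(2, 3)
# [1] FALSE  TRUE  TRUE FALSE
```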
Upvotes: 3
Reputation: 6931
I'm not sure why your attempt didn't work, but I imagine that %in% will use base::match regardless of your redefined match. But why not redefine %in% itself?
`%in%` <- function(x, table) {
  m <- base::match(x, table, nomatch = 0L) > 0L
  if (is.matrix(x)) matrix(m, nrow(x))
  else m
}
m <- matrix(1:4, nrow=2)
m %in% c(2, 3)
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
As suggested in the comments, and as a matter of good practice in general, it would be safer to use a different name, like %inm% or %min%.
Upvotes: 3