Reputation: 2514
I am trying to check whether the string in one column appears in another column of the same row. I tried grepl:
grepl("b", "d,b,c", fixed = TRUE)
> TRUE
which works fine on "standalone" objects, but in a dataframe:
df = data.frame(id = c("a","b"), ids = c("b,c", "d,b,c")) %>%
mutate(match = grepl(id, .$ids, fixed = TRUE), truematch = c(FALSE, TRUE))
> df
  id   ids match truematch
1  a   b,c FALSE     FALSE
2  b d,b,c FALSE      TRUE
it does not give what I expected: I am trying to produce the column truematch, but I can only produce match.
Upvotes: 1
Views: 846
Reputation: 388962
Since grepl is not vectorised over its pattern argument (only the first element of id is used, with a warning), we can use rowwise to apply it to each row:
library(dplyr)
df %>%
rowwise() %>%
mutate(truematch = grepl(id, ids, fixed = TRUE))
#   id    ids   match truematch
#   <fct> <fct> <lgl> <lgl>
# 1 a     b,c   FALSE FALSE
# 2 b     d,b,c FALSE TRUE
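To see why the original mutate call returned FALSE in both rows, here is a minimal base-R sketch using the same two values. It assumes a recent R version in which a pattern of length greater than one triggers a warning rather than an error; the variable names whole_column and row_by_row are only illustrative.

```r
id  <- c("a", "b")
ids <- c("b,c", "d,b,c")

# grepl() takes a single pattern: given two, it keeps only the first ("a")
# and warns, so every string in ids is tested against "a".
whole_column <- suppressWarnings(grepl(id, ids, fixed = TRUE))
print(whole_column)

# mapply() pairs id[i] with ids[i], which is the row-by-row comparison
# the question is after. unname() drops the names mapply() adds.
row_by_row <- unname(mapply(grepl, id, ids, MoreArgs = list(fixed = TRUE)))
print(row_by_row)
```

The first result is FALSE for both rows because "a" occurs in neither string; the second matches the desired truematch column.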
However, rowwise has at times been considered outdated (it is fully supported again as of dplyr 1.0.0); an alternative is purrr::map2_lgl with grepl:
df %>% mutate(truematch = purrr::map2_lgl(id, ids, grepl, fixed = TRUE))
However, for this case a better option is stringr::str_detect, which is vectorised over both string and pattern:
df %>% mutate(truematch = stringr::str_detect(ids, stringr::fixed(id)))
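One caveat, which is an assumption about the data rather than something stated in the question: all of the approaches here test for a substring, so an id of "b" would also match an ids value such as "ab,c". If ids is meant to be a comma-separated list of whole identifiers, a rough sketch like the following splits it first. The helper has_id and the data frame example_df are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: TRUE when id equals one whole comma-separated
# element of ids, compared row by row.
has_id <- function(id, ids) {
  parts <- strsplit(as.character(ids), ",", fixed = TRUE)
  unname(mapply(
    function(one_id, one_parts) one_id %in% trimws(one_parts),
    as.character(id), parts
  ))
}

# Third row added to show the substring case ("b" inside "ab").
example_df <- data.frame(
  id  = c("a", "b", "b"),
  ids = c("b,c", "d,b,c", "ab,c")
)
example_df$truematch <- has_id(example_df$id, example_df$ids)
print(example_df)
```

For the two rows in the question this gives the same answer as the substring test; it only differs in cases like the third row.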
Upvotes: 2
Reputation: 3755
Using sapply to call grepl on each row,
df %>% mutate(match = sapply(seq_len(nrow(.)), function(x) grepl(.$id[x], .$ids[x], fixed = TRUE)))
gives
  id   ids match
1  a   b,c FALSE
2  b d,b,c  TRUE
Upvotes: 2