Reputation: 197
I have a character string vector that I would like to filter based on keywords from a second vector.
Below is a small reprex:
list1 <- c("I like apples", "I eat bread", "Bananas are my favorite")
fruit <- c("apple","banana")
I presume I will need stringr/stringi, but in essence I would like to do something along the lines of list1 %in% fruit and have it return T, F, T.
Any suggestions?
Upvotes: 0
Views: 398
Reputation: 21400
A solution with str_detect:
library(tidyverse)
data.frame(list1) %>%
mutate(Flag = str_detect(list1, paste0("(?i)", paste0(fruit, collapse = "|"))))
list1 Flag
1 I like apples TRUE
2 I eat bread FALSE
3 Bananas are my favorite TRUE
If you want to filter (i.e. subset) your data:
data.frame(list1) %>%
filter(str_detect(list1, paste0("(?i)", paste0(fruit, collapse = "|"))))
list1
1 I like apples
2 Bananas are my favorite
Note that (?i) is used to make the match case-insensitive.
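To make the role of (?i) concrete, here is a small sketch of the pattern that the nested paste0 calls build. It swaps in base R's grepl with perl = TRUE in place of str_detect, purely so that it runs without loading any package; that substitution is an assumption of this sketch, not part of the answer above.

```r
list1 <- c("I like apples", "I eat bread", "Bananas are my favorite")
fruit <- c("apple", "banana")

# Join the keywords into one alternation and prepend the inline flag
pattern <- paste0("(?i)", paste0(fruit, collapse = "|"))
pattern
# [1] "(?i)apple|banana"

# Without the flag, "Bananas" is missed because of its capital B
without_flag <- grepl(paste0(fruit, collapse = "|"), list1)
without_flag
# [1]  TRUE FALSE FALSE

# With the flag (perl = TRUE so the inline modifier is honoured by PCRE)
with_flag <- grepl(pattern, list1, perl = TRUE)
with_flag
# [1]  TRUE FALSE  TRUE
```

The same pattern string is what gets passed to str_detect in the code above.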
EDIT:
To record the matches in a separate column you can use str_extract (if you expect just one match per string) or str_extract_all (for more than one match):
data.frame(list1) %>%
mutate(Flag = str_detect(list1, paste0("(?i)", paste0(fruit, collapse = "|"))),
Match = str_extract_all(list1, paste0("(?i)", paste0(fruit, collapse = "|"))))
list1 Flag Match
1 I like apples TRUE apple
2 I eat bread FALSE
3 Bananas are my favorite TRUE Banana
Upvotes: 2
Reputation: 19097
We can do this with grepl, without using external packages.
grepl can handle multiple patterns separated by |, so we can first concatenate the strings in fruit with | as the separator.
Remember to set ignore.case = TRUE if you don't care about case (note that "Bananas" in your example is capitalized, while the keyword "banana" is not).
grepl(paste(fruit, collapse = "|"), list1, ignore.case = T)
[1] TRUE FALSE TRUE
Or use grep to directly output the strings that match:
# same as list1[grepl(paste(fruit, collapse = "|"), list1, ignore.case = T)]
grep(paste(fruit, collapse = "|"), list1, ignore.case = T, value = T)
[1] "I like apples" "Bananas are my favorite"
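As a rough illustration of why list1 %in% fruit does not give the desired result, and of how to see which keyword matched, here is a base R sketch. The names exact, hits and any_hit are made up for this example; checking one keyword at a time with sapply is just one possible variation on the grepl call above.

```r
list1 <- c("I like apples", "I eat bread", "Bananas are my favorite")
fruit <- c("apple", "banana")

# %in% compares whole elements, so a sentence never equals a keyword
exact <- list1 %in% fruit
exact
# [1] FALSE FALSE FALSE

# One grepl call per keyword: rows follow list1, columns follow fruit
hits <- sapply(fruit, function(k) grepl(k, list1, ignore.case = TRUE))

# A string is kept if at least one keyword was found in it
any_hit <- rowSums(hits) > 0
any_hit
# [1]  TRUE FALSE  TRUE
```

The hits matrix has one column per keyword, which can be handy when you need to know which fruit triggered the match rather than only whether one did.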
Upvotes: 2