Reputation: 2979
I would like to filter a dataframe using filter() and str_detect(), matching multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the dataframe df to show only rows containing the letters a, f and o.
library(dplyr)
library(stringr)

df <- data.frame(numbers = 1:52, letters = letters)
df %>%
filter(
str_detect(.$letters, "a")|
str_detect(.$letters, "f")|
str_detect(.$letters, "o")
)
# numbers letters
#1 1 a
#2 6 f
#3 15 o
#4 27 a
#5 32 f
#6 41 o
I have attempted the following
df %>%
filter(
str_detect(.$letters, c("a", "f", "o"))
)
# numbers letters
#1 1 a
#2 15 o
#3 32 f
which returns only three of the six expected rows and produces the following warning
Warning message: In stri_detect_regex(string, pattern, opts_regex = opts(pattern)) : longer object length is not a multiple of shorter object length
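A likely explanation is that str_detect() is vectorised over both of its arguments, so a vector of patterns is compared with the strings element by element rather than being treated as a set of alternatives. A minimal sketch with equal-length vectors (the example strings are made up for illustration and are not part of df):

```r
library(stringr)

patterns <- c("a", "f", "o")

# Same order as the patterns: each string is paired with "its own" pattern.
aligned <- str_detect(c("a", "f", "o"), patterns)

# Same strings shifted by one: "o" vs "a", "a" vs "f", "f" vs "o".
shifted <- str_detect(c("o", "a", "f"), patterns)

# A single alternation pattern is applied to every string.
alternation <- str_detect(c("o", "a", "f"), "a|f|o")

print(aligned)
print(shifted)
print(alternation)
```

In the 52-row example the three patterns are paired with the rows in turn, which would explain why only rows 1, 15 and 32 survive and why the length warning appears. Newer stringr releases may reject the length mismatch with an error instead of a warning.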
Upvotes: 16
Views: 43841
Reputation: 447
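To build on the accepted answer, one could also define a vector of search patterns and concatenate them with paste() using its collapse argument, with '|' as the separator so that the result is a regular-expression alternation ('or'). Note that '&' does not work the same way for 'and': in a regular expression it is a literal character, so requiring all patterns to match needs separate str_detect() calls combined with & (or lookaheads).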
This could be useful, for example, when the search patterns are automatically generated somewhere else in the script or read from a source.
#' Changing the column name of the letters column to `lttrs`
#' to avoid confusion with the built-in vector `letters`
df <- data.frame(numbers = 1:52, lttrs = letters)
search_vec <- c('a','f','o')
df %>%
filter(str_detect(lttrs, pattern = paste(search_vec, collapse = '|')))
# numbers lttrs
#1 1 a
#2 6 f
#3 15 o
#4 27 a
#5 32 f
#6 41 o
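For the 'and' case, collapsing with '&' would only search for a literal ampersand. One possible sketch is to run str_detect() once per pattern and combine the results with &. The data frame words_df and its columns are hypothetical, chosen because single letters can never contain all three patterns at once:

```r
library(dplyr)
library(stringr)

# Hypothetical data: multi-letter strings so that 'and' can actually match.
words_df <- data.frame(
  id = 1:4,
  word = c("loaf", "fa", "xyz", "oaf"),
  stringsAsFactors = FALSE
)
search_vec <- c("a", "f", "o")

# 'or': one alternation pattern built from the vector.
any_match <- words_df %>%
  filter(str_detect(word, paste(search_vec, collapse = "|")))

# 'and': one logical vector per pattern, reduced with `&`.
all_match <- words_df %>%
  filter(Reduce(`&`, lapply(search_vec, function(p) str_detect(word, p))))

print(any_match)
print(all_match)
```

Here any_match keeps "loaf", "fa" and "oaf", while all_match keeps only "loaf" and "oaf". If the patterns come from an external source and may contain regex metacharacters, they would need escaping before being pasted together.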
Upvotes: 1
Reputation: 13
Is this possible with an "&" rather than an "|"? (Sorry, I don't have enough rep to comment.)
Upvotes: 0
Reputation: 2979
The correct syntax to accomplish this with filter() and str_detect() would be
df %>%
filter(
str_detect(letters, "a|f|o")
)
# numbers letters
#1 1 a
#2 6 f
#3 15 o
#4 27 a
#5 32 f
#6 41 o
Upvotes: 51