JsDart

Reputation: 183

Checking if string contains all elements from vector? (R, dataframes, dplyr)

I have a dataframe in which a column called name stores strings:

name
'Ana, Mari'
'John, Doe'
'Stuart, Matthews'

I also have a vector that stores the parts of a name in an unknown order. For instance, either of:

v <- c('Ana', 'Mari')
v <- c('Mari', 'Ana')

I want to keep ALL the rows whose cell contains ALL elements of the vector. Does anyone know a function that can do this?

I have this so far, but it checks whether the cells contain ANY of the elements in the vector. I don't mind the cells containing extra elements that aren't in the vector, but all the elements in the vector should be contained in the cell.

df <- df %>% filter(grepl(vector, col_name))

Upvotes: 4

Views: 1437

Answers (3)

akrun

Reputation: 887128

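We can use str_detect with filter. Collapsing v with "|" builds an alternation pattern, so this keeps the rows that match any element of v.

library(tidyverse)
df %>% 
     filter(str_detect(name, paste(v, collapse="|")))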
# A tibble: 1 x 1
#       name
#      <chr>
#1 Ana, Mari
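
To require every element of v rather than any of them, one option is to test each element separately and AND the results together. The following is a minimal sketch of that idea; contains_all is a hypothetical helper name, and fixed() is an assumption that the elements should be treated as literal text rather than regular expressions.

```r
library(dplyr)
library(stringr)
library(purrr)

df <- data.frame(name = c("Ana, Mari", "John, Doe", "Stuart, Matthews"),
                 stringsAsFactors = FALSE)
v <- c("Mari", "Ana")

# Hypothetical helper: TRUE only where every pattern occurs in the string
contains_all <- function(x, patterns) {
  # One logical vector per pattern, each as long as x
  hits <- map(patterns, ~ str_detect(x, fixed(.x)))
  # Element-wise AND across patterns; .init covers an empty pattern vector
  reduce(hits, `&`, .init = rep(TRUE, length(x)))
}

result <- df %>% filter(contains_all(name, v))
result
```

This is substring matching, so 'Ana' would also match a cell such as 'Anabel, Mari'. Splitting each cell on ", " and comparing with %in% avoids that.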

Upvotes: 0

eipi10

Reputation: 93811

library(tidyverse)  # loads dplyr, stringr and tibble

df <- tibble(name = c('Ana, Mari', 'John, Doe', 'Stuart, Matthews'))
v <- c('Mari', 'Ana')

Base R:

df[sapply(strsplit(df$name, split=", "), function(str) all(v %in% str)), ]
       name
1 Ana, Mari

tidyverse:

df %>%
  group_by(name) %>%
  filter(all(v %in% str_split(name, ", ", simplify=TRUE)))
       name
1 Ana, Mari

Upvotes: 2

lukeA

Reputation: 54237

You could do

df %>% filter(rowSums(mapply(grepl, x=list(name), pattern=v))==length(v))
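
The one-liner can be unpacked step by step. This is a base R sketch of the same idea: build one logical column per element of v, then keep the rows where every column is TRUE. The variable names hits and keep are illustrative, and fixed = TRUE is an added assumption that the elements of v are literal strings, not regular expressions.

```r
name <- c("Ana, Mari", "John, Doe", "Stuart, Matthews")
v <- c("Mari", "Ana")

# For each element of v, test every string in name
hits <- vapply(v, grepl, logical(length(name)), x = name, fixed = TRUE)

# Force a matrix shape (rows = strings, columns = patterns),
# which also covers the case where name has a single element
hits <- matrix(hits, nrow = length(name))

# A row qualifies when all length(v) patterns were found in it
keep <- rowSums(hits) == length(v)

name[keep]
```

With a data frame, the same logical vector can be used as df[keep, , drop = FALSE].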

Upvotes: 1
