B.Mr.W.

Reputation: 19648

R filter by contains substring multiple conditions

Hey, I have a list of 500K rows that I need to filter by a condition: each row must contain at least one of certain substrings (from another list of 20 substrings).

I am using dplyr package right now and my code looks like this:

result <- data %>% 
          filter( grepl('sub1', column1) |
                  grepl('sub2', column1) |
                  grepl('sub3', column1) |
                  grepl('sub4', column1) |
                  ...
                  grepl('sub20', column1)) 

This whole thing is really killing me as the second list gets longer. Is there an easier (or shorter) way of doing this?

Upvotes: 1

Views: 1135

Answers (1)

akrun

Reputation: 887901

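We can paste the pattern strings together and collapse them with |, the regex alternation (OR) operator, so that a single grepl call matches any of the substrings: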

library(dplyr)
data %>% 
     filter(grepl(paste(paste0('sub', 1:20), collapse="|"), column1))  
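
If the substrings are not literally named sub1 to sub20, the same idea should work with any character vector. The sketch below uses base R only and assumes a hypothetical vector called subs and some made-up rows in data$column1. It also shows a swapped-in alternative, not part of the approach above: matching each substring literally with fixed = TRUE and combining the results with Reduce, which may be safer when a substring contains regex metacharacters such as a dot.

```r
# Hypothetical example data; `data` and `column1` follow the question's names.
data <- data.frame(
  column1 = c("has sub1 here", "nothing", "a.b literal", "axb", "ends with sub20"),
  stringsAsFactors = FALSE
)

# Hypothetical substring list; in practice this would hold the 20 substrings.
subs <- c("sub1", "sub20", "a.b")

# Approach from the answer: one regex with the substrings joined by "|".
# Note that "a.b" is treated as a regex here, so "." matches any character.
pattern <- paste(subs, collapse = "|")
keep_regex <- grepl(pattern, data$column1)

# Alternative (assumption: substrings should be matched literally):
# test each substring with fixed = TRUE, then OR the logical vectors together.
keep_fixed <- Reduce(
  `|`,
  lapply(subs, function(s) grepl(s, data$column1, fixed = TRUE))
)

# Keep only the rows that literally contain at least one substring.
result <- data[keep_fixed, , drop = FALSE]
```

With dplyr loaded, either logical test can go straight into filter(), for example data %>% filter(grepl(pattern, column1)). In this made-up data the two approaches differ only on the row "axb", which the regex version keeps because "." matches "x".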

Upvotes: 1
