Kkyr

Reputation: 55

Grep multiple string statements

To keep only the rows where a specific string appears in any of several columns, I use this:

df1 <- with(df, df[ grepl( 'word1', df$Col1) | grepl( 'word1', df$Col2) | grepl( 'word1', df$Col3), ])

With more than one string, the call has to be repeated for each one:

df1 <- with(df, df[ grepl( 'word1', df$Col1) | grepl( 'word1', df$Col2) | grepl( 'word1', df$Col3), ])
df2 <- with(df, df[ grepl( 'word2', df$Col1) | grepl( 'word2', df$Col2) | grepl( 'word2', df$Col3), ])

How can I replace these separate calls with a single one that handles both 'word1' and 'word2' in one line?

Upvotes: 0

Views: 188

Answers (1)

Thierry

Reputation: 18487

First you need the combined regex. You can test it at https://regex101.com/. Then you can use apply() to run it on each column. This yields a matrix of TRUE or FALSE values with one row per observation and one column per variable. Apply any() over the rows of that matrix to get the selection.

test <- apply(df, 2, grepl, pattern = "word1|word2")
df[apply(test, 1, any), ]
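
If the search should cover only Col1 to Col3, as in the question, rather than every column, the same idea can be sketched as follows. The example data and the names `words`, `cols`, `pattern`, `keep` and `result` are made up for illustration; only the column names and the two search words come from the question. The sketch assumes the words contain no regex metacharacters, otherwise they would need escaping before being pasted together.

```r
# Hypothetical example data; only the Col1..Col3 names come from the question.
df <- data.frame(
  Col1 = c("has word1 here", "nothing", "nothing", "nothing"),
  Col2 = c("nothing", "word2 inside", "nothing", "nothing"),
  Col3 = c("nothing", "nothing", "nothing", "word1 and word2"),
  stringsAsFactors = FALSE
)

words <- c("word1", "word2")
cols  <- c("Col1", "Col2", "Col3")

# Collapse the search terms into a single alternation: "word1|word2".
pattern <- paste(words, collapse = "|")

# One logical vector per selected column, combined element-wise with OR,
# which mirrors the chain of grepl(...) | grepl(...) calls in the question.
keep <- Reduce("|", lapply(df[cols], grepl, pattern = pattern))

# Rows with at least one match in any of the selected columns.
result <- df[keep, ]
```

Combining the per-column results with Reduce() avoids building an intermediate matrix, so it behaves the same way for a data frame with a single row.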

Upvotes: 2
