AlSub

Reputation: 1155

Select rows based on string pattern in R

Suppose I have the following data:

df <- data.frame(name = c("TO for", "Turnover for people", "HC people", 
                          "Hello world", "beenie man", 
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

I want to filter df by a string pattern: keep the rows where the name column starts with one of c("TO", "Turnover", "HC"), and remove all other rows.

The following code gives me a warning message:

library(data.table)
test <- df[df$name %like% c("TO", "Turnover", "HC"), ]

Console output:

Warning message:
In grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed) :
  argument 'pattern' has length > 1 and only the first element will be used

Expected output should look like this:

# name                   number
# TO for                   1
# Turnover for people      2
# HC people                3
# TO is                    8   

Is there any other way to accomplish this?

Upvotes: 2

Views: 2277

Answers (1)

akrun

Reputation: 886938

%like% is not vectorized over the pattern argument, so only the first pattern is used. One option is to loop over the pattern vector with lapply and combine the results into a single logical vector with Reduce:

i1 <- Reduce(`|`, lapply(c("TO", "Turnover", "HC"), `%like%`, vector = df$name))
df[i1, ]
#                 name number
#1              TO for      1
#2 Turnover for people      2
#3           HC people      3
#8               TO is      8

Alternatively, collapse the patterns into a single regular expression separated by | (regex alternation) and pass it to grepl:

pat <- paste(c("TO", "Turnover", "HC"), collapse= "|")
df[grepl(pat, df$name),]
#                 name number
#1              TO for      1
#2 Turnover for people      2
#3           HC people      3
#8               TO is      8

The same collapsed pattern also works with %like%:

df[df$name %like% pat,]
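
Note that the patterns above match anywhere in the string, not only at the beginning. For this data the result is the same, but if the match has to be at the start of name, as the question asks, a minimal sketch could anchor the alternation with ^. The names prefixes, pat_start, res, i2 and res2 are only illustrative and not part of the original code; the second variant assumes the prefixes should be treated as literal text rather than as regular expressions.

```r
# Same example data as in the question
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

prefixes <- c("TO", "Turnover", "HC")  # illustrative name

# Anchor the alternation with ^ so it only matches at the start of the string
pat_start <- paste0("^(", paste(prefixes, collapse = "|"), ")")
res <- df[grepl(pat_start, df$name), ]

# Alternative without regular expressions: startsWith() compares literally.
# as.character() guards against name being a factor (the default in R < 4.0).
i2 <- Reduce(`|`, lapply(prefixes, function(p) startsWith(as.character(df$name), p)))
res2 <- df[i2, ]
```

Both res and res2 should keep the same rows as the unanchored versions here, while a name such as "AUTO parts" would no longer be matched by "TO".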

Upvotes: 1
