Reputation: 2663
I have a quick question about grep that I can't seem to resolve. Let's say that I have a list of names: brand <- c(Brand1, Brand2, Brand3, Brand4). I'd like to identify whether or not any of these names occur within another string variable (var1), and then create a logical variable (T/F).
ID  var1                var_filter
1   Text about Brand 1  TRUE
1   Text                FALSE
1   Text about Brand 2  TRUE
1   Text about Brand 3  TRUE
1   Text                FALSE
1   Text about Brand 1  TRUE
How would I go about doing this? My guess is grep, but I'm not sure how to do it when I have an entire list of possible strings instead of a single string.
Upvotes: 0
Views: 139
Reputation: 263411
Brand1 <- "Brand 1"; Brand2 <- "Brand 2"; Brand3 <- "Brand 3"; Brand4 <- "Brand 4"
brand <- c(Brand1, Brand2, Brand3, Brand4)
dfrm$var_filter <- grepl( paste(brand, collapse="|"), dfrm$var1)
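To see how this plays out on the question's sample rows, here is a minimal sketch. The construction of dfrm is an assumption, since the original post does not show how the data frame was built. Note also that the brand names are used as a regular expression here, so a name containing metacharacters such as "." or "+" would need escaping first.

```r
# Assumed reconstruction of the question's sample data
dfrm <- data.frame(
  ID = rep(1, 6),
  var1 = c("Text about Brand 1", "Text", "Text about Brand 2",
           "Text about Brand 3", "Text", "Text about Brand 1"),
  stringsAsFactors = FALSE
)
brand <- c("Brand 1", "Brand 2", "Brand 3", "Brand 4")

# Collapse the names into one alternation pattern:
# "Brand 1|Brand 2|Brand 3|Brand 4"
pattern <- paste(brand, collapse = "|")

# grepl is vectorized over var1, so this yields one logical per row
dfrm$var_filter <- grepl(pattern, dfrm$var1)
dfrm$var_filter
# TRUE FALSE TRUE TRUE FALSE TRUE
```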
Upvotes: 1
Reputation: 61953
I use a combination of sapply, grepl, and any to accomplish the task. The idea is to use grepl to find which elements of the text contain a given brand, and sapply to do this for each of the brands. Then apply with any identifies which elements of the text contain any of the brands.
brands <- c("CatJuice", "robopuppy", "DasonCo")
text <- c("nononono", "That CatJuice is great", "blargcats", "I gave the robopuppy some CatJuice")
id <- sapply(brands, grepl, text, fixed = TRUE)
# if case sensitivity is an issue
#id <- sapply(tolower(brands), grepl, tolower(text), fixed = TRUE)
apply(id, 1, any)
This is case sensitive, so if that is an issue you could easily use tolower to convert everything to lower case.
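As a rough sketch of the lower-casing variant: the sample text below is altered from the answer's (mixed-case spellings are a hypothetical addition) so that the effect of tolower is visible, and has_brand is just an illustrative name.

```r
brands <- c("CatJuice", "robopuppy", "DasonCo")
# Hypothetical sample text with inconsistent capitalisation
text <- c("nononono", "That catjuice is great", "blargcats",
          "I gave the RoboPuppy some CatJuice")

# Lower-case both sides so matching ignores case;
# fixed = TRUE treats each brand as a literal string, not a regex
id <- sapply(tolower(brands), grepl, tolower(text), fixed = TRUE)

# id is a logical matrix: one row per text element, one column per brand
dim(id)
# 4 3

# A row is TRUE if any brand matched that text element
has_brand <- apply(id, 1, any)
has_brand
# FALSE TRUE FALSE TRUE
```

Without the tolower calls, the second element would not match, because "catjuice" differs in case from "CatJuice".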
Upvotes: 1
Reputation: 3210
You can use | in patterns, like this:
dados <- read.table(text='ID var1
1 TextaboutBrand1
1 Text
1 TextaboutBrand2
1 TextaboutBrand3
1 Text
1 TextaboutBrand1', header=TRUE, sep=' ')
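# brand is not defined in this snippet; assumed here to match the sample data
brand <- c('Brand1', 'Brand2', 'Brand3', 'Brand4')
# TRUE when the var1 field (second element of the row) matches any brand
grep1 <- function(x, brand) { length(grep(paste0(brand,collapse='|'), x[2])) == 1 }
apply(dados,1,grep1,brand)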
Or use mapply()
...
Upvotes: 0