Reputation: 138
I'm working with R and I'm trying to clean my data. I have the following data:
example <- data.frame(x = c("hungry", "fly", "day", "dog"),
                      y = c("i'm hungry", "i believe i can fly", "a hard day's night", "cat"))
I'm trying to check whether each value in column y contains the string from column x in the same row. I tried grepl(), but that function isn't vectorized over the pattern, and I tried str_detect(), but I don't know why it doesn't work. The result I want is the following table:
x y Flag
1 hungry i'm hungry 1
2 fly i believe i can fly 1
3 day a hard day's night 1
4 dog cat 0
Can someone suggest a way to do this, or another approach?
Thanks!
Upvotes: 2
Views: 1148
Reputation: 101538
You can try Vectorize to make grepl vectorized over both of its arguments, e.g.,
example <- within(example,Flag <- +Vectorize(grepl)(x,y))
such that
> example
x y Flag
1 hungry i'm hungry 1
2 fly i believe i can fly 1
3 day a hard day's night 1
4 dog cat 0
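Vectorize(grepl) is a thin wrapper around mapply, so the same row-by-row check can also be written with mapply directly. This is only a minimal sketch: passing fixed = TRUE is an assumption on my part (it treats each x as literal text rather than a regular expression) and is not part of the answer above.

```r
example <- data.frame(x = c("hungry", "fly", "day", "dog"),
                      y = c("i'm hungry", "i believe i can fly", "a hard day's night", "cat"))

# Pair x[i] with y[i] and test each pair separately.
# fixed = TRUE (an assumption) matches x literally instead of as a regex.
flag <- mapply(function(pattern, text) grepl(pattern, text, fixed = TRUE),
               as.character(example$x), as.character(example$y),
               USE.NAMES = FALSE)

# Convert TRUE/FALSE to 1/0, as the unary + does in the one-liner.
example$Flag <- as.integer(flag)
```

If the values in x are meant to be regular expressions, drop fixed = TRUE.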
Upvotes: 2
Reputation: 21400
You can use grepl and ifelse in this way:
example$Flag <- ifelse(grepl(paste0(example$x, collapse = "|"), example$y), 1, 0)
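Using paste0, this collapses example$x into a single pattern with alternatives separated by | and has grepl check whether that combined pattern matches the values in example$y: if a match is found, the ifelse statement assigns 1; if not, 0. Note that the combined pattern tests each y against every value of x, not only the one in the same row; for this example the two give the same result.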
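To see where the collapsed pattern and a row-by-row check can differ, here is a small sketch with made-up data. The data frame df and the names any_match and row_match are hypothetical, chosen only for illustration.

```r
# Hypothetical data: each x occurs in the *other* row's y, not in its own.
df <- data.frame(x = c("day", "dog"),
                 y = c("hot dog", "a hard day's night"),
                 stringsAsFactors = FALSE)

# Collapsed pattern "day|dog": TRUE wherever any x occurs in y.
any_match <- grepl(paste0(df$x, collapse = "|"), df$y)

# Row-by-row: x[i] is tested only against y[i].
row_match <- mapply(grepl, df$x, df$y, USE.NAMES = FALSE)
```

Here any_match is TRUE for both rows while row_match is FALSE for both, so the collapsed pattern is only appropriate when a match against any x should count.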
Alternatively, you can use str_detect from package stringr. Note that the order of the arguments matters: the strings to search in (i.e. those in example$y) come first and the patterns (example$x) second. If the columns are factors, which was the default for data.frame() before R 4.0.0, convert both to character. On the upside, there's no need for the paste0 transformation, because str_detect is vectorized over both arguments:
library(stringr)
example$Flag <- ifelse(str_detect(as.character(example$y), as.character(example$x)), 1, 0)
Result:
example
x y Flag
1 hungry i'm hungry 1
2 fly i believe i can fly 1
3 day a hard day's night 1
4 dog cat 0
Upvotes: 1
Reputation: 5788
Nowhere near as succinct as @jogo's answer, but:
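# Split by row index rather than rownames, which sort as text ("1", "10", "2", ...) once there are 10+ rows
sapply(split(example, seq_len(nrow(example))),
       function(z) grepl(as.character(z$x), as.character(z$y)))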
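The same row-by-row idea can also be sketched as a loop over row indices. Using vapply in place of sapply is my own substitution, not part of the answer above; it guarantees a logical vector in row order, which makes the result safe to assign back as a column.

```r
example <- data.frame(x = c("hungry", "fly", "day", "dog"),
                      y = c("i'm hungry", "i believe i can fly", "a hard day's night", "cat"))

# For each row i, test whether x[i] occurs in y[i].
# logical(1) tells vapply that every call must return exactly one TRUE/FALSE.
flag <- vapply(seq_len(nrow(example)), function(i) {
  grepl(as.character(example$x[i]), as.character(example$y[i]))
}, logical(1))

example$Flag <- as.integer(flag)
```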
Upvotes: 1