kpm

Reputation: 129

Partial string matching for multiple elements in R

Is it possible to partially match a string containing multiple elements against another string containing multiple elements, and return a single TRUE or FALSE rather than a vector?

grepl() can evaluate a string with one element, for example:

dat <- data.frame(x="1", y= "1 2 3")

> grepl(dat$x, dat$y)
[1] TRUE

But when there are multiple elements, I can't seem to find a solution where I can get a single TRUE or FALSE evaluation. I applied the same solution by @r2evans from a different problem:

dat <- data.frame(x="1 2", y= "1 6 7 8")

> mapply(`%in%`, strsplit(dat$x, "\\D+"), strsplit(dat$y, "\\D+"))
      [,1]
[1,]  TRUE
[2,] FALSE

But in this case (if I understand it correctly), it evaluates each element in dat$x and returns TRUE or FALSE for each one. That is close to what I want, but I need a single TRUE or FALSE per row: TRUE if any or all elements in dat$x are present in dat$y, and FALSE if none are, as shown below:

dat <- data.frame(x=c("1 2", "3 6 7", "8 5"), y=c("1 6 7 8", "2 9 10", "8 5 3"), result=c(TRUE, FALSE, TRUE))

      x       y result
1   1 2 1 6 7 8   TRUE #Where 1 is present in y (partial)
2 3 6 7  2 9 10  FALSE #Where none is present 
3   8 5   8 5 3   TRUE #where both 8 and 5 are present (full)

I have tried using paste0 with collapse='|', but I don't think my syntax is correct, since the first row should evaluate to TRUE.

dat <- data.frame(x=c("1 2", "3 6 7", "8 5"), y=c("1 6 7 8", "2 9 10", "8 5 3"))

grepl(paste0(dat$x, collapse='|'), dat$y)
[1] FALSE FALSE  TRUE

Any clarification would be greatly appreciated!

Upvotes: 1

Views: 704

Answers (1)

Ronak Shah

Reputation: 388982

You can use any() inside mapply() to return a single TRUE or FALSE value for each row. any() returns TRUE if at least one element of x is found in y.

mapply(function(x, y) any(x %in% y), 
       strsplit(dat$x, "\\s+"), strsplit(dat$y, "\\s+"))

#[1]  TRUE FALSE  TRUE
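
As for why the grepl() attempt in the question returns FALSE for the first row: paste0(dat$x, collapse='|') joins all rows into one pattern, "1 2|3 6 7|8 5", so grepl() searches every y for the literal substrings "1 2", "3 6 7" or "8 5" rather than for the individual numbers. A minimal sketch of both approaches on the question's data follows. The column names result and result_regex and the patterns variable are made up for illustration, and the regex variant assumes the elements are plain whitespace-separated tokens with no regex metacharacters.

```r
# Example data from the question
dat <- data.frame(
  x = c("1 2", "3 6 7", "8 5"),
  y = c("1 6 7 8", "2 9 10", "8 5 3"),
  stringsAsFactors = FALSE
)

# Set-based check: split each row into tokens, then test for any overlap
dat$result <- mapply(
  function(x, y) any(x %in% y),
  strsplit(dat$x, "\\s+"),
  strsplit(dat$y, "\\s+")
)

# Regex variant: build one pattern per row, e.g. "\\b(1|2)\\b" for "1 2".
# The word boundaries keep "1" from matching inside "10".
patterns <- paste0("\\b(", gsub("\\s+", "|", trimws(dat$x)), ")\\b")
dat$result_regex <- mapply(
  grepl, patterns, dat$y,
  MoreArgs = list(perl = TRUE),
  USE.NAMES = FALSE
)

dat
```

Both columns should come out as TRUE, FALSE, TRUE here. The set-based version is usually the safer choice, since it compares whole tokens and does not depend on the elements being valid regex fragments.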

Upvotes: 1
