Ensa

Reputation: 115

R partial string matching between two elements of two vectors, anywhere within element

I'm trying to match the elements of vector (b) against the elements of vector (a), where each element of vector (b) has at its end one element of vector (a). The solution should return a vector of length(b) containing the indexes of the matches in (a).

So, for example:

a <- c('R2', 'R3', 'N_3', 'R1')

b <- c('sp_one_R1', 'sp_one_N_3', 'sp_two_R3')

some.function(a,b)

should give:
[1] 4 3 2 

I've investigated pmatch and grep + lapply but can't find a solution. I've also thought of splitting the elements of (b) on '_', but this character can also appear in the elements of (a), so that won't work either.

Any help much appreciated!

Upvotes: 1

Views: 1137

Answers (2)

thelatemail

Reputation: 93938

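In base R, use sapply with grepl to test every element of (a) against every element of (b), then use max.col to find which column, i.e. which element of (a), matched in each row: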

max.col(sapply(a, grepl, b))
#[1] 4 3 2

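This works because the inner sapply call returns a logical matrix with one row per element of (b) and one column per element of (a), and max.col picks the column holding the TRUE in each row: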

sapply(a, grepl, b)
#        R2    R3   N_3    R1
#[1,] FALSE FALSE FALSE  TRUE
#[2,] FALSE FALSE  TRUE FALSE
#[3,] FALSE  TRUE FALSE FALSE
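One caveat worth sketching: max.col breaks ties at random by default, so a row with several TRUE values, or with none at all, gives an unpredictable index. The variant below is only a sketch under the assumption that the elements of (a) should be treated as literal strings rather than regular expressions (hence fixed = TRUE); the names m and idx are just illustrative.

```r
a <- c('R2', 'R3', 'N_3', 'R1')
b <- c('sp_one_R1', 'sp_one_N_3', 'sp_two_R3')

# Rows correspond to elements of b, columns to elements of a.
# fixed = TRUE matches each element of a literally, not as a regex.
m <- sapply(a, grepl, b, fixed = TRUE)

# ties.method = "first" makes the choice deterministic when a row
# has more than one TRUE (the default, "random", does not).
idx <- max.col(m, ties.method = "first")

# A row with no TRUE at all would still get a column index,
# so mark those rows as NA explicitly.
idx[rowSums(m) == 0] <- NA
idx
#[1] 4 3 2
```

Note that sapply only returns a matrix when (b) has more than one element, so a length-one (b) would need separate handling.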

Upvotes: 3

Moderat

Reputation: 1560

Using the type-safe map_dbl from purrr (covered in a tutorial by Jenny BC), I get:

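a <- c('R2', 'R3', 'N_3', 'R1')
b <- c('sp_one_R1', 'sp_one_N_3', 'sp_two_R3')

# For each element of source_vec, return the index of the first
# element of dest_vec found anywhere inside it (NA if none is found).
myfun <- function(source_vec, dest_vec) {
  purrr::map_dbl(source_vec, ~ which(stringr::str_detect(., dest_vec))[1])
}
myfun(b, a) # 4 3 2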
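Since the question says the element of (a) always sits at the end of each element of (b), the same idea can be sketched in base R with endsWith (available from R 3.3.0) instead of a search anywhere in the string. This is only an illustration: match_suffix is a hypothetical helper name, and preferring the longest matching suffix when several elements of (a) fit is an assumption, not something the question specifies.

```r
a <- c('R2', 'R3', 'N_3', 'R1')
b <- c('sp_one_R1', 'sp_one_N_3', 'sp_two_R3')

# Hypothetical helper: for each element of b, return the index of the
# element of a that it ends with, or NA if none matches.
match_suffix <- function(b, a) {
  vapply(b, function(x) {
    # endsWith recycles x against every candidate suffix in a
    hits <- which(endsWith(x, a))
    if (length(hits) == 0L) return(NA_integer_)
    # Assumption: if several suffixes match, prefer the longest one
    hits[which.max(nchar(a[hits]))]
  }, integer(1), USE.NAMES = FALSE)
}

match_suffix(b, a)
#[1] 4 3 2
```

Because the match is anchored at the end and compared literally, characters such as '_' or '.' in (a) need no escaping.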

Upvotes: 1
