Adam Sanders

Reputation: 39

For each row in DF, check if there is a match in a vector

I have a data frame in R, and for each row I want to check whether the string in that row matches any element of a vector. I can't seem to get it to work exactly right.

exampledf=as.data.frame(c("PIT","SLC"))
colnames(exampledf)="Column1"
examplevector=c("PITTPA","LAXLAS","JFKIAH")

This gets me close, but the result for each row is a vector of (1,0,0) instead of a single 0 or 1:

exampledf$match=by(exampledf,1:nrow(exampledf),function(row) ifelse(grepl(exampledf$Column1,examplevector),1,0))

Expected result:

exampledf$match=c("1","0")

Upvotes: 1

Views: 282

Answers (2)

David Klotz

Reputation: 2431

grepl returns a logical vector the same length as your examplevector. You can wrap it with the any() function (much like checking sum() > 0, as suggested above) to collapse it to a single value per row.

Here's a slightly modified form of your code:

exampledf$match = vapply(exampledf$Column1, function(x) any(grepl(x, examplevector)), 1L)
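To make the role of any() concrete, here is a minimal sketch of what grepl returns for one pattern and how that result collapses to a single flag. The variable names (patterns, hits_pit, flag_pit, count_pit, match_flags) are made up for illustration, and fixed = TRUE is an assumption that the airport codes should be treated as literal substrings rather than regular expressions.

```r
examplevector <- c("PITTPA", "LAXLAS", "JFKIAH")
patterns <- c("PIT", "SLC")  # hypothetical stand-in for exampledf$Column1

# grepl() yields one TRUE/FALSE per element of examplevector
hits_pit <- grepl("PIT", examplevector, fixed = TRUE)

# any() collapses that vector to a single flag; sum() counts the hits instead
flag_pit <- any(hits_pit)
count_pit <- sum(hits_pit)

# One flag per pattern, converted to 0/1
match_flags <- as.integer(vapply(
  patterns,
  function(p) any(grepl(p, examplevector, fixed = TRUE)),
  logical(1)
))
```

Using logical(1) as the vapply template and converting afterwards keeps the type check explicit; drop as.integer() if a TRUE/FALSE column is good enough.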

Upvotes: 2

Adam Warner

Reputation: 1354

So here is my solution:

library(dplyr)
exampledf=as.data.frame(c("PIT","SLC"))
colnames(exampledf)="Column1"
examplevector=c("PITTPA","LAXLAS","JFKIAH")

pmatch does what you want and gives you the index of the element of examplevector that each value matches. Use duplicates.ok because you want multiple matches to show up. If you don't want that, set the argument to FALSE. I just used dplyr to create the new column, but you can do this however you would like.

exampledf %>% mutate(match_flag = ifelse(is.na(pmatch(Column1, examplevector, duplicates.ok = TRUE)), 0,
                                         pmatch(Column1, examplevector, duplicates.ok = TRUE)))

   Column1 match_flag
1     PIT          1
2     SLC          0
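One caveat worth checking before relying on this: pmatch only matches from the start of a string, and the value it returns is a position in examplevector rather than a 0/1 flag (it happens to be 1 here because "PITTPA" is the first element). The sketch below illustrates both points; the pattern "TPA" and the variable names are made up for the illustration, and fixed = TRUE in the grepl call is an assumption that the codes are literal text.

```r
examplevector <- c("PITTPA", "LAXLAS", "JFKIAH")

# "PIT" is a prefix of "PITTPA", so pmatch() returns that element's position
prefix_idx <- pmatch("PIT", examplevector, duplicates.ok = TRUE)

# "TPA" only occurs at the end of "PITTPA", so pmatch() reports no match (NA)
suffix_idx <- pmatch("TPA", examplevector, duplicates.ok = TRUE)

# grepl() looks for the substring anywhere in each element
suffix_anywhere <- any(grepl("TPA", examplevector, fixed = TRUE))

# Converting the positions into a plain 0/1 flag
idx <- pmatch(c("PIT", "SLC"), examplevector, duplicates.ok = TRUE)
match_flag <- as.integer(!is.na(idx))
```

If the codes can appear anywhere in the longer string, the grepl approach is probably the safer choice; if they are always prefixes, either works.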

Upvotes: 1
