Reputation: 39
I have a data frame in R, and for each row I want to check whether the string in that row matches any element of a vector. I can't seem to get it to work exactly right.
exampledf=as.data.frame(c("PIT","SLC"))
colnames(exampledf)="Column1"
examplevector=c("PITTPA","LAXLAS","JFKIAH")
This gets me close, but the result is a vector of (1,0,0) instead of a 0 or 1 for each row
exampledf$match=by(exampledf,1:nrow(exampledf),function(row) ifelse(grepl(exampledf$Column1,examplevector),1,0))
Expected result:
exampledf$match=c("1","0")
Upvotes: 1
Views: 282
Reputation: 2431
grepl returns a logical vector the same length as your examplevector. You can wrap it with the any() function (equivalent to using sum() as suggested above).
Here's a slightly modified form of your code:
exampledf$match = vapply(exampledf$Column1, function(x) any(grepl(x, examplevector)), 1L)
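To see why any() is needed, here is a minimal sketch that runs each pattern separately. The names patterns and flags are just for illustration, and fixed = TRUE is my own assumption that the codes should be matched literally rather than as regular expressions.

```r
examplevector <- c("PITTPA", "LAXLAS", "JFKIAH")
patterns <- c("PIT", "SLC")  # hypothetical stand-in for exampledf$Column1

# grepl() tests ONE pattern against every element, so the result
# has length(examplevector) entries, not one entry per pattern.
grepl("PIT", examplevector, fixed = TRUE)  # TRUE FALSE FALSE
grepl("SLC", examplevector, fixed = TRUE)  # FALSE FALSE FALSE

# any() collapses each of those vectors to a single TRUE/FALSE,
# and vapply() applies that to every pattern in turn.
flags <- vapply(
  patterns,
  function(p) any(grepl(p, examplevector, fixed = TRUE)),
  logical(1),
  USE.NAMES = FALSE
)

as.integer(flags)  # 1 0
```

If you want the character values "1" and "0" from the question, wrap the last line in as.character().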
Upvotes: 2
Reputation: 1354
So here is my solution:
library(dplyr)
exampledf=as.data.frame(c("PIT","SLC"))
colnames(exampledf)="Column1"
examplevector=c("PITTPA","LAXLAS","JFKIAH")
pmatch does what you want here and tells you which element of examplevector each value matches. Note that it only matches at the start of a string. Use duplicates.ok = TRUE because you want multiple matches to show up; if you don't want that, set the argument to FALSE. I just used dplyr to create the new column, but you can do this however you would like.
exampledf %>%
  mutate(match_flag = ifelse(is.na(pmatch(Column1, examplevector, duplicates.ok = TRUE)),
                             0,
                             pmatch(Column1, examplevector, duplicates.ok = TRUE)))
  Column1 match_flag
1     PIT          1
2     SLC          0
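One thing to keep in mind with this approach: pmatch does partial matching from the start of the string only, and it returns NA when more than one element matches partially. A small sketch of both cases follows; the variable names and the second vector are made up for illustration.

```r
examplevector <- c("PITTPA", "LAXLAS", "JFKIAH")

# "PIT" is a prefix of "PITTPA", so pmatch() returns its index.
# "TPA" appears in "PITTPA" but not at the start, so pmatch() gives NA.
prefix_hits <- pmatch(c("PIT", "TPA"), examplevector, duplicates.ok = TRUE)
prefix_hits  # 1 NA

# grepl() looks for the substring anywhere, so it does find "TPA".
substring_hit <- any(grepl("TPA", examplevector, fixed = TRUE))
substring_hit  # TRUE

# If two elements share the prefix, the partial match is ambiguous
# and pmatch() returns NA rather than picking one.
ambiguous <- pmatch("LAX", c("LAXLAS", "LAXJFK"))
ambiguous  # NA
```

So if your codes can appear anywhere in the longer string, or the same prefix can occur more than once, the grepl-based answer is probably the safer fit.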
Upvotes: 1