Reputation: 671
For several hundred columns, I would like to check whether the values in a given column of df match the values in the list stored in the corresponding column of df2.
Example data:
df <- data.frame(a=c("1","2","3","4"), b=c("1", NA,NA, "99"))
df$a <- as.character(df$a)
df$b <- as.character(df$b)
df2 <- data.frame(c=I(list(c("1","2","3"))), d=I(list(c("1","0"))))
> df
a b
1 1 1
2 2 <NA>
3 3 <NA>
4 4 99
> df2
c d
1 1, 2, 3 1, 0
I have tried the following function:
check <- function(dat1=df, dat2=df2) {
for(c in ncol(df)) {
for(r in nrow(df)) {
df[r,c] <- ifelse(df[r,c] %in% as.character(unlist(df2[1,c])),"match", "nomatch")
}
}
return(df)
}
check(df, df2)
Output:
> check(df, df2)
a b
1 1 1
2 2 <NA>
3 3 <NA>
4 4 nomatch
Desired output:
> check(df, df2)
a b
1 match match
2 match <NA>
3 match <NA>
4 nomatch nomatch
Upvotes: 0
Views: 298
Reputation: 388972
You can use Map/mapply:
mat <- mapply(function(x, y) ifelse(is.na(x), NA, x %in% unlist(y)), df, df2)
#You can also use replace similarly
#mat <- mapply(function(x, y) replace(x %in% unlist(y), is.na(x), NA), df, df2)
mat
# a b
#[1,] TRUE TRUE
#[2,] TRUE NA
#[3,] TRUE NA
#[4,] FALSE FALSE
Now turn these TRUE/FALSE values into "match"/"No match" if needed.
mat[] <- c('No match', 'match')[mat + 1]
# a b
#[1,] "match" "match"
#[2,] "match" NA
#[3,] "match" NA
#[4,] "No match" "No match"
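If a data frame of labels is wanted rather than a logical matrix, the same idea can be written with Map. This is only a minimal sketch: the helper name label_matches is hypothetical, and it assumes df and df2 have their columns in the same order, since Map pairs them by position.

```r
df <- data.frame(a = c("1", "2", "3", "4"), b = c("1", NA, NA, "99"),
                 stringsAsFactors = FALSE)
df2 <- data.frame(c = I(list(c("1", "2", "3"))), d = I(list(c("1", "0"))))

# label_matches is a hypothetical helper, not part of the answer above.
# Map pairs the i-th column of dat1 with the i-th column of dat2 by position.
label_matches <- function(dat1, dat2) {
  cols <- Map(function(x, y) {
    out <- ifelse(x %in% unlist(y), "match", "nomatch")
    out[is.na(x)] <- NA  # keep missing values missing
    out
  }, dat1, dat2)
  names(cols) <- names(dat1)
  as.data.frame(cols, stringsAsFactors = FALSE)
}

res <- label_matches(df, df2)
```

Here res keeps the column names of df, and the NA cells in column b stay NA instead of being labelled "nomatch".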
Upvotes: 1
Reputation: 11514
Try
check <- function(dat1, dat2){
out <- ifelse(t(apply(dat1, 1, function(row) row %in% unlist(dat2))), "match", 'nomatch')
out[is.na(dat1)] <- NA
colnames(out) <- colnames(dat1)
return(as.data.frame(out))
}
check(df, df2)
> check(df, df2)
a b
1 match match
2 match <NA>
3 match <NA>
4 nomatch nomatch
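Note that unlist(dat2) pools the values from every column of dat2, so each cell is compared against all of the lists at once rather than only the list for its own column. That gives the desired result for the example data, but the two lookups can differ. The sketch below uses a made-up value, "0", which appears only in the list for the second column:

```r
df2 <- data.frame(c = I(list(c("1", "2", "3"))), d = I(list(c("1", "0"))))

# Pooled lookup: every value from every column of df2
pooled <- unlist(df2)
# Column-specific lookup: only the list stored in the first column
first_only <- unlist(df2[[1]])

in_pooled <- "0" %in% pooled     # "0" is found because it occurs in column d
in_first <- "0" %in% first_only  # column c holds only "1", "2", "3"
```

If a "0" in column a should count as a non-match, a column-by-column comparison is probably the safer choice.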
Upvotes: 1
Reputation: 2949
You should loop over the full range of column and row indices (for example, 1:nrow(df)), not just the number of columns and rows; for(r in nrow(df)) visits only the last row. You also need an extra ifelse() to handle NA values.
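check <- function(dat1=df, dat2=df2) {
  # Work on the function arguments, and visit every column and every row
  for(c in seq_len(ncol(dat1))) {
    for(r in seq_len(nrow(dat1))) {
      # Keep missing values as NA; otherwise look the value up in the
      # list stored in the same column position of dat2
      dat1[r,c] <- ifelse(is.na(dat1[r,c]), NA,
                          ifelse(dat1[r,c] %in% as.character(unlist(dat2[1,c])),
                                 "match", "nomatch"))
    }
  }
  return(dat1)
}
> check(df, df2)
a b
1 match match
2 match <NA>
3 match <NA>
4 nomatch nomatch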
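The difference between a count and a range of indices can be seen directly. In this small sketch (the variable names are only for illustration), ncol(df) is the single number 2, so a for loop over it runs once, while seq_len(ncol(df)) yields every column index:

```r
df <- data.frame(a = c("1", "2", "3", "4"), b = c("1", NA, NA, "99"),
                 stringsAsFactors = FALSE)

# Looping over the count: the loop body runs once, with j equal to ncol(df)
visited_count <- integer(0)
for (j in ncol(df)) visited_count <- c(visited_count, j)

# Looping over the range: the loop body runs for each column index
visited_range <- integer(0)
for (j in seq_len(ncol(df))) visited_range <- c(visited_range, j)
```

seq_len(n) is generally preferred over 1:n because 1:0 counts down from 1 to 0, whereas seq_len(0) is empty.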
Upvotes: 1