user2300940

Reputation: 2385

Check if cells in a data frame are identical to another column

I would like to check whether the names in columns "Pred1" and "Pred2" are identical to the name in column "Expected" for the same row. If the names are identical, the result should be TRUE; otherwise it should be FALSE. I tried the identical() function, but I am not sure how to apply it to each cell.

Input:

Expected         Pred1            Pred2
Bacteroides      Bacillus         Bacteroides
Bifidobacterium  Bifidobacterium  Escherichia

Desired output:

Expected        Pred1         Pred2
Bacteroides      FALSE         TRUE
Bifidobacterium  TRUE          FALSE

Upvotes: 1

Views: 255

Answers (3)

Cole

Reputation: 11255

lapply() loops over the columns that you want to check. Passing '==' as the function compares each of those columns element-wise with the extra argument, d[, 'Expected'].

lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])
# equivalent to
lapply(d[, c('Pred1', 'Pred2')], function(x) x == d[, 'Expected'])

$Pred1
[1] FALSE  TRUE

$Pred2
[1]  TRUE FALSE

To get it into the right format, you can assign them back to the original columns. Note I made a copy but you can just as easily assign the results to the original data.frame.

d_copy <- d

d_copy[, c('Pred1', 'Pred2')] <- lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])

d_copy
         Expected Pred1 Pred2
1     Bacteroides FALSE  TRUE
2 Bifidobacterium  TRUE FALSE
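As a side note, the same comparison can probably be written without lapply(), because comparing a data.frame with a vector of length nrow() is done column by column. The following is only a minimal sketch under that assumption; the names pred_cols, matches and result are made up for illustration, and the data is rebuilt here so the snippet runs on its own.

```r
# Sample data with the same values as in the question
d <- data.frame(
  Expected = c("Bacteroides", "Bifidobacterium"),
  Pred1    = c("Bacillus", "Bifidobacterium"),
  Pred2    = c("Bacteroides", "Escherichia"),
  stringsAsFactors = FALSE
)

pred_cols <- c("Pred1", "Pred2")  # hypothetical name for the columns to check

# Comparing a data.frame with a vector recycles the vector down each column,
# so every prediction column is compared with Expected row by row.
matches <- d[pred_cols] == d$Expected   # logical matrix with columns Pred1, Pred2

# Put the Expected column back in front of the logical results
result <- cbind(d["Expected"], as.data.frame(matches))
result
```

This keeps the original column names, which makes the output easy to read next to the input.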

Upvotes: 1

jay.sf

Reputation: 72593

You could use outer() with a vectorized identical() check over row and column indices.

fun <- Vectorize(function(x, y) identical(d[x, 1], d[x, y]))
cbind(d[1], Pred=outer(1:2, 2:3, fun))
#          Expected Pred.1 Pred.2
# 1     Bacteroides  FALSE   TRUE
# 2 Bifidobacterium   TRUE  FALSE

Or do it with ==.

sapply(1:2, function(x) d[x, 1] == d[x, 2:3])
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
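The two approaches above agree on this data, but identical() and == can differ when values are missing: identical() compares two whole objects and always returns a single TRUE or FALSE, while == works element by element and returns NA when either side is NA. A small sketch of the difference follows; the vectors x and y and the "two NAs count as a match" rule are assumptions for illustration, not part of the question.

```r
# Two hypothetical columns that both contain a missing value
x <- c("Bacteroides", NA)
y <- c("Bacteroides", NA)

elementwise <- x == y           # TRUE NA: the NA propagates
whole       <- identical(x, y)  # TRUE: one answer for the whole vectors

# One possible NA-safe element-wise check (assumption: two NAs are a match,
# an NA against a real value is a mismatch)
na_safe <- (x == y) | (is.na(x) & is.na(y))
na_safe[is.na(na_safe)] <- FALSE
na_safe
```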

Data

d <- structure(list(Expected = c("Bacteroides", "Bifidobacterium"), 
    Pred1 = c("Bacillus", "Bifidobacterium"), Pred2 = c("Bacteroides", 
    "Escherichia")), row.names = c(NA, -2L), class = "data.frame")

Upvotes: 1

fabla

Reputation: 1816

Solution using a for loop:

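l <- list()
# Compare the first column (Expected) with every later column in turn
for (i in 2:length(df)) {
  l[[i]] <- df[, 1] == df[, i]
}
# l[[1]] is NULL and is dropped by cbind(); the resulting columns are unnamed
df1 <- as.data.frame(do.call(cbind, l))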

Data:

df <- data.frame(Expected = c("Bacteroides", "Bifidobacterium"),
                 Pred1 = c("Bacillus", "Bifidobacterium"),
                 Pred2 = c("Bacteroides", "Escherichia"),
                 stringsAsFactors = FALSE)
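If the original column names should be kept in the result, the loop can be written over column names instead of positions. This is only a sketch of one way to do it; the name out is made up for illustration, and the data is repeated so the snippet is self-contained.

```r
df <- data.frame(Expected = c("Bacteroides", "Bifidobacterium"),
                 Pred1 = c("Bacillus", "Bifidobacterium"),
                 Pred2 = c("Bacteroides", "Escherichia"),
                 stringsAsFactors = FALSE)

out <- df  # hypothetical name; work on a copy so df stays unchanged

# Overwrite every column after the first with its element-wise
# comparison against the first column (Expected)
for (col in names(df)[-1]) {
  out[[col]] <- df[[col]] == df[[1]]
}
out
```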

Upvotes: 1
