Reputation: 199
I have two large data frames with identical row and column names. I would like to identify which "cells" differ between them. For instance, say I have tab1 and tab2:
tab1 <- data.frame(name=c('arthur', 'john', 'david', 'loopy'), grade=c(1, 4, 3, 2), size=c(23, 34, 23, 13))
tab2 <- data.frame(name=c('jean', 'john', 'david', 'loopy'), grade=c(1, 4, 5, 2), size=c(23, 34, 23, 16))
I would like the function to report [1,1], [3,2], and [4,3] as discordant.
There are numbers, factors, and character values in the cells. No dates.
Any ideas?
Upvotes: 0
Views: 84
Reputation: 193507
Make sure your "name" columns are characters and not factors, then you can simply use == to check for equality, or != to check for inequality:
> tab1$name <- as.character(tab1$name)
> tab2$name <- as.character(tab2$name)
> tab1 == tab2
      name grade  size
[1,] FALSE  TRUE  TRUE
[2,]  TRUE  TRUE  TRUE
[3,]  TRUE FALSE  TRUE
[4,]  TRUE  TRUE FALSE
To get the positions, use which(..., arr.ind = TRUE).
> which(tab1 != tab2, arr.ind = TRUE)
     row col
[1,]   1   1
[2,]   3   2
[3,]   4   3
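If you need this more than once, the two steps above can be wrapped in a small function. The following is only a sketch: the name diff_cells is made up for illustration, and the NA handling is an assumption about what you want, not part of the approach above. A plain != returns NA wherever either cell is NA, and which() silently skips those, so the sketch counts a cell as discordant when exactly one side is NA.

```r
# Hypothetical helper (the name diff_cells is illustrative only):
# report the row/col positions of cells that differ between two
# data frames with the same dimensions and column names.
diff_cells <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)), identical(names(a), names(b)))

  # Convert factor columns to character so that labels are compared;
  # all other columns are left untouched.
  to_chr <- function(df) {
    df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)
    df
  }
  a <- to_chr(a)
  b <- to_chr(b)

  neq <- a != b  # logical matrix; NA where either cell is NA

  # Assumption: a cell differs if exactly one side is NA,
  # and matches if both sides are NA.
  na_a <- is.na(a)
  na_b <- is.na(b)
  either <- na_a | na_b
  neq[either] <- (na_a != na_b)[either]

  which(neq, arr.ind = TRUE)
}

tab1 <- data.frame(name = c('arthur', 'john', 'david', 'loopy'),
                   grade = c(1, 4, 3, 2), size = c(23, 34, 23, 13))
tab2 <- data.frame(name = c('jean', 'john', 'david', 'loopy'),
                   grade = c(1, 4, 5, 2), size = c(23, 34, 23, 16))

res <- diff_cells(tab1, tab2)
print(res)
```

On the example data this should report the same three positions as the which() call above: [1,1], [3,2], and [4,3].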
Upvotes: 4