Reputation: 345
How to find the cells that are different in two data frames?
df1 = structure(list(Name = c(10746359L, 11034174L, 10279660L, 10127534L,
10956764L, 10172699L, 10689723L, 10966980L, 10497750L, 10833372L,
10077477L), Green = c(98L, 86L, 15L, 29L, 77L, 87L, 83L, 79L,
75L, 46L, 40L), Blue = c(23L, 82L, 48L, 19L, 13L, 41L, 70L, 78L,
100L, 76L, 75L), Red = c(78L, 55L, 14L, 100L, 59L, 40L, 67L,
70L, 19L, 39L, 83L), Orange = c(1L, 75L, 17L, 14L, 74L, 53L,
53L, 78L, 60L, 27L, 86L), Yellow = c("Berlin", "London", "Frankfurt",
"Beijing", "New York", "Chicago", "Auckland", "Sydney", "Paris",
"Barcelona", "Madrid"), Violet = c(0.558015352, 0.997666691,
0.035279025, 0.921518397, 0.172728814, 0.772205286, 0.390398637,
0.362153606, 0.650357655, 0.606278069, 0.442747248)), class = "data.frame", row.names = c(NA,
-11L))
df2 = structure(list(Name = c(10746359L, 11034174L, 10279660L, 10127534L,
10956764L, 10172699L, 10689723L, 10966980L, 10497750L, 10833372L,
10077477L), Green = c(98L, 86L, 15L, 29L, 77L, 87L, 83L, 79L,
75L, 46L, 40L), Blue = c(23L, 82L, 48L, 19L, 13L, 41L, 70L, 42L,
100L, 76L, 75L), Red = c(78L, 55L, 14L, 100L, 59L, 40L, 67L,
70L, 19L, 39L, 83L), Orange = c(1L, 75L, 17L, 14L, 74L, 53L,
53L, 78L, 60L, 27L, 86L), Yellow = c("Berlin", "Melbourne", "Frankfurt",
"Beijing", "New York", "Chicago", "Auckland", "Sydney", "Paris",
"Barcelona", "Madrid"), Violet = c(0.558015352, 0.997666691,
0.035279025, 0.921518397, 0.172728814, 0.772205286, 0.390398637,
0.362153606, 0.650357655, 0.606278069, 0.442747248)), class = "data.frame", row.names = c(NA,
-11L))
The cells that differ between the two data frames are row 8 of Blue (78 vs. 42) and row 2 of Yellow ("London" vs. "Melbourne").
I tried setdiff, but it shows me an entire row.
Upvotes: 1
Views: 39
Reputation: 73265
We can use
ij <- which(df1 != df2, arr.ind = TRUE)
# row col
#[1,] 8 3
#[2,] 2 6
If you prefer column names to column indices, then
data.frame(row = ij[, 1], col = names(df1)[ij[, 2]])
# row col
#1 8 Blue
#2 2 Yellow
Of course, before doing != you'd better ensure that
identical(dim(df1), dim(df2)) is TRUE;
identical(names(df1), names(df2)) is TRUE;
identical(sapply(df1, class), sapply(df2, class)) is TRUE.
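If it helps, the checks and the comparison can be wrapped together. The following is only a sketch: diff_cells is a hypothetical helper name, not a base R function, and the two small data frames x and y are made up for illustration. It stops if any of the three checks fails and otherwise lists each differing cell with both values.

```r
# Hypothetical helper (not part of base R): validate that the two data frames
# are comparable, then report every differing cell with both of its values.
diff_cells <- function(a, b) {
  stopifnot(
    identical(dim(a), dim(b)),
    identical(names(a), names(b)),
    identical(sapply(a, class), sapply(b, class))
  )
  ij <- which(a != b, arr.ind = TRUE)
  # Look up one cell of a data frame and return it as text, so that values
  # from columns of different types can share one result column.
  cell <- function(d, k) as.character(d[ij[k, 1], ij[k, 2]])
  idx <- seq_len(nrow(ij))
  data.frame(
    row = ij[, 1],
    col = names(a)[ij[, 2]],
    value1 = vapply(idx, function(k) cell(a, k), character(1)),
    value2 = vapply(idx, function(k) cell(b, k), character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up example data with one difference per column.
x <- data.frame(Blue = c(70L, 78L), Yellow = c("London", "Sydney"),
                stringsAsFactors = FALSE)
y <- data.frame(Blue = c(70L, 42L), Yellow = c("Melbourne", "Sydney"),
                stringsAsFactors = FALSE)
res <- diff_cells(x, y)
print(res)
```

Note that which() skips NA, so a cell where either value is NA is not reported by this approach; such cells would need a separate is.na() comparison.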
Upvotes: 2