Reputation: 4092
How do I check if two objects, e.g. dataframes, are value equal in R?
By value equal, I mean the value of each row of each column of one dataframe is equal to the value of the corresponding row and column in the second dataframe.
Upvotes: 62
Views: 92612
Reputation: 1732
Here is another method using comparedf from the arsenal package.
It reports the differences detected per variable, the variables not shared (different columns, for example), the number of observations not shared, and a summary of the overall comparison.
df1 <- data.frame(id = paste0("person", 1:3),
a = c("a", "b", "c"),
b = c(1, 3, 4))
> df1
id a b
1 person1 a 1
2 person2 b 3
3 person3 c 4
df2 <- data.frame(id = paste0("person", 4:1),
a = c("c", "b", "a", "f"),
b = c(1, 3, 4, 4),
d = paste0("rn", 1:4))
> df2
id a b d
1 person4 c 1 rn1
2 person3 b 3 rn2
3 person2 a 4 rn3
4 person1 f 4 rn4
library(arsenal)
comparedf(df1, df2)
Compare Object
Function Call:
comparedf(x = df1, y = df2)
Shared: 3 non-by variables and 3 observations.
Not shared: 1 variables and 0 observations.
Differences found in 2/3 variables compared.
0 variables compared have non-identical attributes.
A more detailed summary is also available. The code below returns several tables:
summary(comparedf(df1, df2))
More information about the package and the function is available here.
Additionally, you can use all.equal(df1, df2) too:
[1] "Attributes: < Component “row.names”: Numeric: lengths (3, 4) differ >"
[2] "Length mismatch: comparison on first 3 components"
[3] "Component “id”: Lengths (3, 4) differ (string compare on first 3)"
[4] "Component “id”: 3 string mismatches"
[5] "Component “a”: Lengths (3, 4) differ (string compare on first 3)"
[6] "Component “a”: 2 string mismatches"
[7] "Component “b”: Numeric: lengths (3, 4) differ"
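Because all.equal() returns a character vector describing the differences (rather than FALSE) when the objects differ, it is usually wrapped in isTRUE() when a single logical answer is needed. A minimal sketch, using small data frames shaped like df1 and df2 above (the names d1 and d2 are only for illustration):

```r
d1 <- data.frame(id = paste0("person", 1:3), b = c(1, 3, 4))
d2 <- data.frame(id = paste0("person", 4:1), b = c(1, 3, 4, 4))

res <- all.equal(d1, d2)
report_is_text <- is.character(res)     # the mismatch report is text, not FALSE

same <- isTRUE(res)                     # collapse the report into one logical
same_self <- isTRUE(all.equal(d1, d1))  # an object compared with itself
```

Using the result of all.equal() directly inside if() would fail on a mismatch, since a character vector is not a valid condition.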
Upvotes: 1
Reputation: 5269
To compare only the structure (class and attributes) of two data sets, without relying on another package:
structure_df1 <- sapply(df1, function(x) paste(class(x), attributes(x), collapse = ""))
structure_df2 <- sapply(df2, function(x) paste(class(x), attributes(x), collapse = ""))
all(structure_df1 == structure_df2)
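Note that the == comparison above recycles (with a warning) when the two data sets have a different number of columns. As a rough alternative sketch, column names and classes can be compared with identical(), which simply returns FALSE when the lengths differ. The helper name same_structure and the example data frames are hypothetical, not part of base R:

```r
# Hypothetical helper: TRUE only if column names and column classes all match
same_structure <- function(x, y) {
  identical(names(x), names(y)) &&
    identical(lapply(x, class), lapply(y, class))
}

s1 <- data.frame(id = c("a", "b"), n = c(1, 2))
s2 <- data.frame(id = c("c", "d"), n = c(3, 4))      # same layout, other values
s3 <- data.frame(id = c("a", "b"), n = c("1", "2"))  # n is character here
s4 <- data.frame(id = c("a", "b"))                   # one column fewer

same_structure(s1, s2)
same_structure(s1, s3)
same_structure(s1, s4)
```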
Upvotes: 0
Reputation: 5000
We can use the R package compare to test whether the names of the objects and the values are the same, in just one step.
a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)
library(compare)
compare(a, b)
# FALSE [TRUE]
# objects are not identical (different names), but values are the same.
If we only care about equality of the values, we can set ignoreNames = TRUE:
compare(a, b, ignoreNames = TRUE)
# TRUE
#   dropped names
The package has additional interesting functions such as compareEqual and compareIdentical.
Upvotes: 9
Reputation: 681
In addition, identical() is still useful when the practical goal is to compare the values themselves, for example column by column:
identical(a[, "x"], b[, "y"]) # TRUE
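One thing to keep in mind is that identical() is strict about storage type and exact floating-point values, so columns that print the same can still compare as different. A small sketch of how this differs from all.equal(), which compares numbers within a tolerance:

```r
x_int <- 1:3          # integer vector
x_dbl <- c(1, 2, 3)   # double vector with the same values

strict_types <- identical(x_int, x_dbl)          # FALSE: integer vs double
loose_types  <- isTRUE(all.equal(x_int, x_dbl))  # TRUE: values agree

strict_float <- identical(0.1 + 0.2, 0.3)          # FALSE: rounding error
loose_float  <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: within tolerance
```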
Upvotes: 14
Reputation: 31761
It is not clear what it means for two data frames to be "value equal", but to test whether the values are the same, here is an example of two non-identical data frames with equal values:
a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)
To test if all values are equal:
all(a == b) # TRUE
To test if objects are identical (they are not, they have different column names):
identical(a,b) # FALSE: class, colnames, rownames must all match.
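Two caveats with all(a == b): it returns NA when either data frame contains missing values, and == is only defined for data frames of the same size. A minimal sketch of a more defensive check, assuming that two NAs in the same position should count as equal (values_equal is a hypothetical helper name, not a base R function):

```r
values_equal <- function(x, y) {
  # Data frames of different shapes cannot be value equal
  if (!identical(dim(x), dim(y))) return(FALSE)
  same <- x == y                  # logical matrix, NA where either side is NA
  same[is.na(same)] <- FALSE      # treat unknown comparisons as mismatches...
  both_na <- is.na(x) & is.na(y)  # ...unless both cells are missing
  all(same | both_na)
}

a_na <- data.frame(x = c(1, NA, 3))
b_na <- data.frame(y = c(1, NA, 3))
c_no <- data.frame(y = c(1, 2, 3))

naive   <- all(a_na == b_na)          # NA, because of the missing value
careful <- values_equal(a_na, b_na)   # TRUE: same values, NAs line up
differs <- values_equal(a_na, c_no)   # FALSE: NA vs 2 in the second row
```

Column names are ignored here, as in all(a == b); only position and value matter.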
Upvotes: 73