mindless.panda

Reputation: 4092

Compare if two dataframe objects in R are equal?

How do I check if two objects, e.g. dataframes, are value equal in R?

By value equal, I mean the value of each row of each column of one dataframe is equal to the value of the corresponding row and column in the second dataframe.

Upvotes: 62

Views: 92612

Answers (5)

emr2

Reputation: 1732

Here is another method using comparedf from the arsenal package.

It reports the differences detected per variable, the variables not shared (different columns, for example), the number of observations not shared, and a summary of the overall comparison.

df1 <- data.frame(id = paste0("person", 1:3),
                  a = c("a", "b", "c"),
                  b = c(1, 3, 4))

> df1
       id a b
1 person1 a 1
2 person2 b 3
3 person3 c 4


df2 <- data.frame(id = paste0("person", 4:1),
                  a = c("c", "b", "a", "f"),
                  b = c(1, 3, 4, 4),
                  d = paste0("rn", 1:4))

> df2
       id a b   d
1 person4 c 1 rn1
2 person3 b 3 rn2
3 person2 a 4 rn3
4 person1 f 4 rn4


library(arsenal)
comparedf(df1, df2)

Compare Object
Function Call: 
comparedf(x = df1, y = df2)

Shared: 3 non-by variables and 3 observations.
Not shared: 1 variables and 0 observations.

Differences found in 2/3 variables compared.
0 variables compared have non-identical attributes.

A more detailed summary is also available:

summary(comparedf(df1, df2))

The code above returns several tables:

  • Summary of data.frames
  • Summary of overall comparison
  • Variables not shared
  • Other variables not compared
  • Observations not shared
  • Differences detected by variable
  • Differences detected
  • Non-identical attributes

Here you have more info about the package and the function.

You can also use all.equal(df1, df2), which lists the differences it finds:

[1] "Attributes: < Component “row.names”: Numeric: lengths (3, 4) differ >"
[2] "Length mismatch: comparison on first 3 components"                    
[3] "Component “id”: Lengths (3, 4) differ (string compare on first 3)"    
[4] "Component “id”: 3 string mismatches"                                  
[5] "Component “a”: Lengths (3, 4) differ (string compare on first 3)"     
[6] "Component “a”: 2 string mismatches"                                   
[7] "Component “b”: Numeric: lengths (3, 4) differ"
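
Note that all.equal returns a character vector describing the differences rather than FALSE, so it should be wrapped in isTRUE() when a single logical result is needed. A minimal sketch using the same df1 and df2 as above (the names res, is_equal and self_equal are only illustrative):

```r
# Same example data as above
df1 <- data.frame(id = paste0("person", 1:3),
                  a = c("a", "b", "c"),
                  b = c(1, 3, 4))
df2 <- data.frame(id = paste0("person", 4:1),
                  a = c("c", "b", "a", "f"),
                  b = c(1, 3, 4, 4),
                  d = paste0("rn", 1:4))

# all.equal() gives TRUE on a match, otherwise a character vector of messages
res <- all.equal(df1, df2)

# isTRUE() collapses that into a single TRUE/FALSE that is safe inside if()
is_equal <- isTRUE(res)
self_equal <- isTRUE(all.equal(df1, df1))
```

Here is_equal is FALSE and self_equal is TRUE, while res still holds the messages if you want to inspect them.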

Upvotes: 1

MS Berends

Reputation: 5269

To compare only the structure (class and attributes) of two data sets, without relying on another package:

structure_df1 <- sapply(df1, function(x) paste(class(x), attributes(x), collapse = ""))
structure_df2 <- sapply(df2, function(x) paste(class(x), attributes(x), collapse = ""))

all(structure_df1 == structure_df2)
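
One caveat: == on the two vectors assumes both data sets have the same number of columns. A hedged variant that checks the column count first and then compares only the column classes, ignoring column names, might look like this (same_structure is a hypothetical helper, not a base R function, and unlike the code above it does not look at other attributes):

```r
# Hypothetical helper: TRUE if both data frames have the same number of
# columns and the same class for each column, position by position
same_structure <- function(x, y) {
  # Different column counts can never have the same structure
  if (ncol(x) != ncol(y)) return(FALSE)
  # Collect the class vector of every column and drop the column names
  classes_x <- unname(lapply(x, class))
  classes_y <- unname(lapply(y, class))
  identical(classes_x, classes_y)
}

d1 <- data.frame(x = 1:3, y = c("a", "b", "c"))
d2 <- data.frame(p = 4:6, q = c("d", "e", "f"))        # same classes, other names
d3 <- data.frame(p = c(1.5, 2, 3), q = c("d", "e", "f")) # numeric instead of integer
d4 <- cbind(d2, extra = 1:3)                            # one more column

same_12 <- same_structure(d1, d2)
same_13 <- same_structure(d1, d3)
same_14 <- same_structure(d1, d4)
```

With this data same_12 is TRUE, while same_13 and same_14 are FALSE.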

Upvotes: 0

milan

Reputation: 5000

We can use the R package compare to test whether the names of the object and the values are the same, in just one step.

a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)

library(compare)
compare(a, b)
# FALSE [TRUE]
# objects are not identical (different names), but values are the same.

If we only care about equality of the values, we can set ignoreNames = TRUE:

compare(a, b, ignoreNames = TRUE)
#TRUE
#  dropped names

The package has additional interesting functions such as compareEqual and compareIdentical.

Upvotes: 9

Brad Horn

Reputation: 681

In addition, identical is still useful and supports the practical goal:

identical(a[, "x"], b[, "y"]) # TRUE

Upvotes: 14

David LeBauer

Reputation: 31761

It is not clear what it means for two data frames to be "value equal". If the goal is to test whether the values are the same, here is an example of two non-identical data frames with equal values:

a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)

To test if all values are equal:

all(a == b) # TRUE

To test if objects are identical (they are not, they have different column names):

identical(a,b) # FALSE: class, colnames, rownames must all match.
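
One thing to watch for: all(a == b) returns NA rather than TRUE when both data frames contain NA in the same position, and == raises an error if the data frames differ in size. A minimal sketch of a stricter check, assuming that matching NAs should count as equal and that the default numeric tolerance of all.equal is acceptable (values_equal is a hypothetical helper name):

```r
# Hypothetical helper: TRUE if x and y have the same dimensions and the same
# values column by column, ignoring column names and other attributes
values_equal <- function(x, y) {
  # Different shapes can never be value equal
  if (!identical(dim(x), dim(y))) return(FALSE)
  # Compare columns by position; isTRUE() turns the all.equal() messages
  # into FALSE
  all(vapply(seq_along(x), function(i) {
    isTRUE(all.equal(x[[i]], y[[i]], check.attributes = FALSE))
  }, logical(1)))
}

a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)
a_na <- data.frame(x = c(1, NA, 3))
b_na <- data.frame(y = c(1, NA, 3))
c_na <- data.frame(y = c(1, NA, 4))

naive <- all(a_na == b_na)           # NA, because NA == NA is NA
eq_plain <- values_equal(a, b)       # same values, different names
eq_na <- values_equal(a_na, b_na)    # NAs in matching positions
eq_diff <- values_equal(a_na, c_na)  # third value differs
eq_shape <- values_equal(a, a_na)    # different number of rows
```

Here eq_plain and eq_na are TRUE, eq_diff and eq_shape are FALSE, and naive is NA.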

Upvotes: 73
