Reputation: 131
I have two data frames, df1 and df2, which I believe contain the same data, but the rows are not in the same order. How can I check that they have the same rows, possibly in a different order?
Upvotes: 1
Views: 1798
Reputation: 3007
dplyr::all_equal() is really useful for this.
library(dplyr)
df_1 <- mtcars
df_2 <- mtcars %>% arrange(mpg) # Change row order
df_3 <- mtcars %>% select(disp, everything()) # change column order
all_equal(df_1, df_2, ignore_row_order = FALSE)
#> [1] "Same row values, but different order"
all_equal(df_1, df_3, ignore_col_order = FALSE)
#> [1] "Same column names, but different order"
all_equal(df_1, df_2, ignore_row_order = TRUE)
#> [1] TRUE
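As a side note, all_equal() was deprecated in dplyr 1.1.0, so on newer versions the calls above may emit a deprecation warning. Below is a minimal sketch of a different technique (not the one above): two anti-joins on all columns, assuming both data frames have the same column names. The variable names `only_in_1`, `only_in_2`, and `same_rows` are made up for illustration. Note that this does not compare how many times a duplicated row occurs beyond the overall row count.

```r
library(dplyr)

df_1 <- mtcars
df_2 <- mtcars %>% arrange(mpg)   # same rows, different order

# Rows of one data frame that have no exact match in the other
only_in_1 <- anti_join(df_1, df_2, by = names(df_1))
only_in_2 <- anti_join(df_2, df_1, by = names(df_1))

# Same rows if the row counts agree and neither side has unmatched rows
same_rows <- nrow(df_1) == nrow(df_2) &&
  nrow(only_in_1) == 0 &&
  nrow(only_in_2) == 0
same_rows
```

If the check fails, `only_in_1` and `only_in_2` show which rows differ, which is often more useful than a bare TRUE/FALSE.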
Upvotes: 1
Reputation: 887078
We can first sort both datasets by all of their columns and reset the row names:
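# unname() keeps column names from being matched to order()'s own arguments
tmp1 <- df1[do.call(order, unname(as.list(df1))), ]
row.names(tmp1) <- NULL
tmp2 <- df2[do.call(order, unname(as.list(df2))), ]
row.names(tmp2) <- NULL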
Then use all.equal to check whether they are the same:
all.equal(tmp1, tmp2, check.attributes = FALSE)
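The two steps above can be wrapped into one function. This is only a sketch: `same_rows` and `sort_rows` are hypothetical names, and it assumes both data frames have the same set of column names (column order may differ). Keep in mind that all.equal compares numeric columns with a small tolerance rather than exactly, and returns a character description instead of FALSE when the inputs differ, hence the isTRUE() wrapper.

```r
# Hypothetical helper: TRUE when both data frames hold the same rows,
# regardless of row order. Assumes matching column names.
same_rows <- function(a, b) {
  if (nrow(a) != nrow(b) || !setequal(names(a), names(b))) return(FALSE)
  b <- b[names(a)]                        # align column order with `a`
  sort_rows <- function(d) {
    # Sort by every column, then drop the original row names
    d <- d[do.call(order, unname(as.list(d))), , drop = FALSE]
    row.names(d) <- NULL
    d
  }
  isTRUE(all.equal(sort_rows(a), sort_rows(b), check.attributes = FALSE))
}

df1 <- mtcars
df2 <- mtcars[sample(nrow(mtcars)), ]     # same rows, shuffled
df3 <- mtcars
df3$mpg[1] <- 0                           # one value changed

same_rows(df1, df2)
same_rows(df1, df3)
```

The first call compares a shuffled copy, the second a copy with one altered value, so they exercise both outcomes of the check.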
NOTE: The OP's question is "How can I check that they have the same rows but perhaps in a different order?"
NOTE 2: No external packages are used.
Upvotes: 2