Compare two data frames with same columns in R

Question

I'm looking to compare two data frames in R and basically see where differences exist. Most of the answers I've found don't offer the solution in a format I'm looking for (e.g., found many answers comparing 1 column only, comparing only numeric value, showing only the total number of changes).

I've tried waldo::compare(tbl_1,tbl_2), and that got me close, but the output isn't formatted in a manner to share with non-R users. The output format also isn't ideal for large data frames.

I've also tried using anti_join from dplyr, which got me the closest to the ideal output; however, the anti_join solution doesn't show which variables changed, it only lets me know if there was some sort of the change in any given row. My tables look like the below, but with significantly more columns and rows. Each table has the same columns and same number of rows. I'm ideally trying to get an output that matches the last table below (tbl_diff), but this is proving to be less straightforward than expected.

join1 <- anti_join(tbl_1, tbl_2)
join2 <- anti_join(tbl_2, tbl_1)
tbl_diff <- left_join(tbl_1, tbl_2)

tbl_1:

id	state	widget	count	color
_bxhd7	IL	widget_1	5	green
_ex9un	MA	widget_1	2	brown
_aolm0	CA	widget_3	3	yellow
_m017e	FL	widget_7	8	orange

tbl_2:

id	state	widget	count	color
_bxhd7	TX	widget_1	5	green
_ex9un	MA	widget_1	20	brown
_aolm0	CA	widget_3	3	yellow
_m017e	FL	widget_2	8	blue

tbl_diff (ideal output):

id	state	widget	count	color
_bxhd7	FALSE	TRUE	TRUE	TRUE
_ex9un	TRUE	TRUE	FALSE	TRUE
_aolm0	TRUE	TRUE	TRUE	TRUE
_m017e	TRUE	FALSE	TRUE	FALSE

Compare two data frames with same columns in R

Answers (1)

Related Questions