Reputation: 2443
I have two data frames:
df = structure(list(x = c(NA, NA, "b", "b", "b"), y = c("f", "f",
"f", "g", "g")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
df2 = structure(list(x = c(NA, NA, "a", "b", "b"), y = c("g", "f",
"f", "g", "g")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
I would like to find the identical rows, when considering NA as a value.
df == df2
By that logic, the second row should be TRUE, but instead we get NA. The reason for this is clear, but can we modify df == df2 so that these rows are considered equal?
Upvotes: 1
Views: 46
Reputation: 886938
One option would be to replace the NA with a value that does not occur in either dataset, do the comparison, and use rowSums to check whether every column in a row matches
rowSums(replace(df2, is.na(df2), "0") == replace(df, is.na(df), "0"))== 2
#[1] FALSE TRUE FALSE TRUE TRUE
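The replacement value only works if it never occurs in the real data. A minimal sketch of the same idea with that check made explicit is below. The helper name `rows_equal_sentinel` and the default sentinel string are made up for illustration, and the sketch assumes character columns and two data frames of the same shape.

```r
df  <- data.frame(x = c(NA, NA, "b", "b", "b"), y = c("f", "f", "f", "g", "g"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c(NA, NA, "a", "b", "b"), y = c("g", "f", "f", "g", "g"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: compare rows after replacing NA with a sentinel string
rows_equal_sentinel <- function(a, b, sentinel = "<NA-sentinel>") {
  stopifnot(identical(dim(a), dim(b)))
  # The sentinel must not already occur in either data frame
  stopifnot(!any(a == sentinel, na.rm = TRUE),
            !any(b == sentinel, na.rm = TRUE))
  a[is.na(a)] <- sentinel
  b[is.na(b)] <- sentinel
  # A row matches when every column matches
  unname(rowSums(a == b) == ncol(a))
}

res_sentinel <- rows_equal_sentinel(df, df2)
res_sentinel
#[1] FALSE TRUE FALSE TRUE TRUE
```

Comparing against ncol(a) rather than a hard-coded 2 keeps the helper usable for data frames with a different number of columns.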
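Or without replacing, create a logical condition with is.na. A cell counts as a match when both values are NA, or when both are non-NA and equal (the !is.na guards also turn the NA produced by comparing a value with NA into FALSE)
rowSums((is.na(df) & is.na(df2)) | (!is.na(df) & !is.na(df2) & df == df2)) == ncol(df)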
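As a hedged sketch, the is.na approach can be wrapped in a small helper that counts a cell as matching only when both values are NA, or both are non-NA and equal. The name `rows_equal_na` and the small example data are made up for illustration. The data includes a row where only one side has an NA, which should not count as a match.

```r
# Row 2 has an NA in a only; row 3 differs in column x
a <- data.frame(x = c(NA, "b", "b"), y = c("f", NA, "g"), stringsAsFactors = FALSE)
b <- data.frame(x = c(NA, "b", "a"), y = c("f", "g", "g"), stringsAsFactors = FALSE)

# Hypothetical helper: NA-aware row comparison without a replacement value
rows_equal_na <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)))
  both_na <- is.na(a) & is.na(b)
  # value == NA yields NA; the !is.na() guards turn that into FALSE
  both_eq <- !is.na(a) & !is.na(b) & a == b
  unname(rowSums(both_na | both_eq) == ncol(a))
}

res_na <- rows_equal_na(a, b)
res_na
#[1] TRUE FALSE FALSE
```

Because every cell ends up as TRUE or FALSE, rowSums never returns NA here, so the result can be used directly for subsetting.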
Upvotes: 1
Reputation: 51582
You can paste the columns of each row together and compare the resulting strings, i.e.
do.call(paste, df) == do.call(paste, df2)
#[1] FALSE TRUE FALSE TRUE TRUE
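One caveat worth checking on your own data: paste turns NA into the string "NA" and joins columns with a space by default, so rows that differ can end up as the same string. The sketch below uses made-up data to show both effects. The "\r" separator is only an assumption about what does not occur in the data.

```r
# No row is truly identical here: NA vs the string "NA", and "a b"/"c" vs "a"/"b c"
d1 <- data.frame(x = c(NA, "a b"), y = c("f", "c"), stringsAsFactors = FALSE)
d2 <- data.frame(x = c("NA", "a"), y = c("f", "b c"), stringsAsFactors = FALSE)

# Default separator: both rows collide
default_cmp <- do.call(paste, d1) == do.call(paste, d2)
default_cmp
#[1] TRUE TRUE

# A separator that is absent from the data fixes the second row,
# but NA and the string "NA" still compare equal in the first row
sep_cmp <- do.call(paste, c(d1, sep = "\r")) == do.call(paste, c(d2, sep = "\r"))
sep_cmp
#[1] TRUE FALSE
```

If a literal "NA" string can appear in the data, one of the is.na based comparisons is the safer choice.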
Upvotes: 1