Reputation: 226247
I'd like to be able to compare two formulas. I'm wondering about the advantages/disadvantages/pitfalls of using identical(), ==, or equality of the deparsed formulas.
Consider e.g.
xx <- ~0
environment(xx) <- new.env()
xx == ~0 returns TRUE
identical(xx, ~0) and identical(xx, ~0, ignore.environment=TRUE) return FALSE (the ignore.environment argument is only applied when comparing closures)
deparse(xx) == "~0" returns TRUE
(but deparsing is almost always a bad idea ...)
To be clear, I want to test whether the formulas are semantically equivalent; I don't care about their environments. It would be a bonus to be able to do formula expansion and ignore the order of terms (e.g. so that ~a*b and ~b+a+a:b would be equivalent), but that's too much of a rabbit hole to worry about. I will settle for ~a+b and ~b+a being non-equivalent, as long as ~a+b (environment 1) and ~a+b (environment 2) are the same.
It did occur to me to write a comparison function that replaces the environment of both values with emptyenv() and then uses identical(), but that seemed convoluted.
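For concreteness, the emptyenv() idea could be sketched roughly as follows. This is only an illustration: the name same_formula is made up here, and the sketch assumes both arguments are plain formula objects with no extra attributes beyond their class and environment.

```r
# Hypothetical helper (not from any package): compare two formulas
# while ignoring the environments attached to them.
same_formula <- function(f1, f2) {
  stopifnot(inherits(f1, "formula"), inherits(f2, "formula"))
  # Only the local copies are modified; the caller's objects are untouched.
  environment(f1) <- emptyenv()
  environment(f2) <- emptyenv()
  identical(f1, f2)
}

xx <- ~0
environment(xx) <- new.env()
same_formula(xx, ~0)          # environments differ, contents match
same_formula(~a + b, ~b + a)  # term order differs, so not identical
```

Because the comparison is still structural, ~a+b and ~b+a remain non-equivalent, which matches the requirement stated above.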
Are there edge cases/reasons I shouldn't just use == here?
Upvotes: 4
Views: 555
Reputation: 76450
When identical returns FALSE, I always try all.equal; it is much less strict. Quoting its help page:
all.equal(x, y) is a utility to compare R objects x and y testing ‘near equality’. If they are different, comparison is still made to some extent, and a report of the differences is returned.
xx <- ~0
environment(xx) <- new.env()
all.equal(xx, ~0)
#[1] TRUE
In if statements, however, all.equal should not be used as is; the right way is isTRUE(all.equal(...)). Again from the documentation:
Do not use all.equal directly in if expressions—either use isTRUE(all.equal(....)) or identical if appropriate.
isTRUE(all.equal(xx, ~0))
#[1] TRUE
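The reason for the isTRUE() wrapper is that all.equal does not return FALSE when the objects differ; it returns a character description of the differences, which if cannot use as a condition. A minimal sketch of the idiom, where formulas_match is a hypothetical helper name chosen for illustration:

```r
# Hypothetical wrapper around the documented isTRUE(all.equal(...)) idiom.
formulas_match <- function(f1, f2) isTRUE(all.equal(f1, f2))

f1 <- ~a + b
f2 <- ~b + a

# When the formulas differ, all.equal() gives a character report, not FALSE.
res <- all.equal(f1, f2)

if (formulas_match(f1, f2)) {
  message("formulas match")
} else {
  message("formulas differ: ", paste(res, collapse = "; "))
}
```

Note that, like ==, this treats ~a+b and ~b+a as different.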
Upvotes: 2