Ben Bolker

Reputation: 226247

How to best compare formulas?

I'd like to be able to compare two formulas. I'm wondering about the advantages/disadvantages/pitfalls of using identical(), ==, or equality of the deparsed formulas.

Consider e.g.

xx <- ~0
environment(xx) <- new.env()

To be clear, I want to test whether the formulas are semantically equivalent; I don't care about their environments. It would be a bonus to be able to do formula expansion and ignore the order of terms (e.g. so that ~a*b and ~b+a+a:b would be equivalent), but that's too much of a rabbit hole to worry about. I will settle for ~a+b and ~b+a being non-equivalent, as long as ~a+b (environment 1) and ~a+b (environment 2) are the same.
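For the order-insensitive "bonus" case, one possible starting point is to compare the expanded term labels from terms() rather than the formula text. This is only a rough sketch: same_terms() and norm_labels() are hypothetical helper names, only the term labels and the intercept are compared (the response and any offsets are ignored), and splitting labels on ":" will misfire for terms such as I(a:b).

```r
# Hypothetical helper: expanded term labels in a canonical order.
norm_labels <- function(f) {
  labs <- attr(terms(f), "term.labels")
  # Sort the variables inside each interaction so "b:a" and "a:b" match.
  labs <- vapply(strsplit(labs, ":", fixed = TRUE),
                 function(p) paste(sort(p), collapse = ":"),
                 character(1))
  sort(labs)
}

# Hypothetical helper: TRUE if both formulas expand to the same set of terms.
same_terms <- function(f1, f2) {
  identical(norm_labels(f1), norm_labels(f2)) &&
    identical(attr(terms(f1), "intercept"), attr(terms(f2), "intercept"))
}

same_terms(~a * b, ~b + a + a:b)  # TRUE
same_terms(~a + b, ~a * b)        # FALSE
```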

It did occur to me to write a comparison function that replaces the environment of both formulas with emptyenv() and then uses identical(), but that seemed convoluted.
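For reference, a minimal sketch of that idea, assuming both arguments are formula objects; same_formula() is a hypothetical name, not an existing function. The environments are replaced only on the local copies inside the function, so the caller's formulas are left unchanged.

```r
# Hypothetical helper: compare two formulas while ignoring their environments.
same_formula <- function(f1, f2) {
  environment(f1) <- emptyenv()
  environment(f2) <- emptyenv()
  identical(f1, f2)
}

xx <- ~0
environment(xx) <- new.env()

same_formula(xx, ~0)          # TRUE: only the environments differed
same_formula(~a + b, ~b + a)  # FALSE: term order still matters
```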

Are there edge cases/reasons I shouldn't just use == here?

Upvotes: 4

Views: 555

Answers (1)

Rui Barradas

Reputation: 76450

When identical() returns FALSE, I always try all.equal() next; it is much less strict. Quoting its help page:

all.equal(x, y) is a utility to compare R objects x and y testing ‘near equality’. If they are different, comparison is still made to some extent, and a report of the differences is returned.

xx <- ~0
environment(xx) <- new.env()

all.equal(xx, ~0)
#[1] TRUE

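In if statements, however, all.equal() should not be used as is: when the objects differ it returns a character vector describing the differences rather than FALSE. The right way is isTRUE(all.equal(...)). Again from the documentation: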

Do not use all.equal directly in if expressions—either use isTRUE(all.equal(....)) or identical if appropriate.

isTRUE(all.equal(xx, ~0))
#[1] TRUE
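For comparison, here is a short sketch of how the other approaches mentioned in the question behave on the same example. The variable yy is introduced only for illustration. According to the help page for the comparison operators, language objects are deparsed to character strings before == compares them, so == and an explicit deparse() comparison both ignore the environment.

```r
xx <- ~0
environment(xx) <- new.env()
yy <- ~0

identical(xx, yy)                    # FALSE: the environments are compared too
identical(deparse(xx), deparse(yy))  # TRUE: only the formula text is compared
isTRUE(xx == yy)                     # TRUE: == also works on the deparsed text
```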

Upvotes: 2
