user3664601

Reputation: 43

unique() or duplicated() with all.equal() functionality?

I am looking for a (simple) function in R that removes duplicated elements, like unique() or duplicated(), but that treats numerical values as equal when they are "nearly equal" in the sense of all.equal():

unique( c(0, 0))
[1] 0

works fine, but

unique( c(0, cos(pi/2)) )
[1] 0.000000e+00 6.123234e-17

does not remove the second element, although a comparison with all.equal returns TRUE:

all.equal( 0, cos(pi/2) )
[1] TRUE

The same holds for duplicated():

duplicated( c(0, cos(pi/2)))
[1] FALSE FALSE

Any suggestions? Thanks!

Upvotes: 3

Views: 298

Answers (3)

JVL

Reputation: 656

You might also consider the zapsmall function:

x <- rep(c(1,2), each=5) + rnorm(10)/(10^rep(1:5,2))
unique(x)
# [1] 1.0571484 1.0022854 1.0014347 0.9998829 0.9999985 2.1095720 1.9888208 2.0002687 1.9999723 2.0000078

unique(zapsmall(x, digits=4))
# [1] 1.0571 1.0023 1.0014 0.9999 1.0000 2.1096 1.9888 2.0003 2.0000
unique(zapsmall(x, digits=2))
# [1] 1.06 1.00 2.11 1.99 2.00
unique(zapsmall(x, digits=0))
# [1] 1 2
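
One caveat worth checking on the question's own data: zapsmall() chooses its rounding precision relative to the largest absolute value in the vector, so it only removes a tiny value when a larger element is present. A minimal sketch in base R, assuming the default digits option (7):

```r
# zapsmall() rounds relative to the largest |x| in the vector.
x1 <- c(0, cos(pi/2))
n1 <- length(unique(zapsmall(x1)))   # 2: cos(pi/2) is itself the largest value, so it survives

x2 <- c(0, cos(pi/2), 1)
u2 <- unique(zapsmall(x2))           # 0 1: relative to 1, cos(pi/2) is rounded to 0
print(n1)
print(u2)
```

So for vectors whose entries are all close to zero, a fixed absolute tolerance may be the safer choice.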

Upvotes: 3

Carl Witthoft

Reputation: 21532

You could try this code (disclaimer: it is from my package cgwtools):

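approxeq <- function (x, y, tolerance = .Machine$double.eps^0.5, ...) 
{
    # Element-wise comparison; R recycles the shorter vector.
    if (length(x) != length(y)) 
        warning("x,y lengths differ. Will recycle.")
    # TRUE where the absolute difference is below the tolerance.
    checkit <- abs(x - y) < tolerance
    # Returned invisibly, so assign the result to inspect it.
    return(invisible(checkit))
}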
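
approxeq() compares two vectors element by element, so it does not deduplicate on its own. A rough sketch of how the same absolute-tolerance test could drive a unique()-like function; the name unique_tol is hypothetical (it is not part of cgwtools or base R), and the sketch assumes a numeric vector without NA values:

```r
# Hypothetical helper: keep an element only if it is not within
# `tolerance` (absolute difference) of an element that was already kept.
unique_tol <- function(x, tolerance = .Machine$double.eps^0.5) {
  keep <- logical(length(x))   # all FALSE to start with
  for (i in seq_along(x)) {
    # x[keep] holds only earlier, already accepted elements
    keep[i] <- !any(abs(x[i] - x[keep]) < tolerance)
  }
  x[keep]
}

res <- unique_tol(c(0, cos(pi/2), 1, 1 + 1e-16))
print(res)   # 0 1
```

The loop is quadratic in the worst case, which is fine for short vectors. Because each element is compared only against the kept representatives, the result can depend on the order of the input.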

Upvotes: 0

gagolews

Reputation: 13056

If you'd like to take into account the absolute error, and not the relative error (as all.equal does), try:

x <- c(0, cos(pi/2), 1, 1+1e-16)
unique(x)
## [1] 0.000000e+00 6.123234e-17 1.000000e+00
(x <- x[!duplicated(round(x, 10))])
## [1] 0 1

This removes elements that agree after rounding to a fixed number of decimal places (10 in the example above).
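
The same idea can be wrapped so that it behaves like duplicated() on the data from the question. This is only a sketch; duplicated_round is a hypothetical name, and 10 decimal places is just the value used above:

```r
# Hypothetical wrapper: flag duplicates after rounding to `digits` decimal places.
duplicated_round <- function(x, digits = 10) {
  duplicated(round(x, digits))
}

d1 <- duplicated_round(c(0, cos(pi/2)))
print(d1)   # FALSE TRUE

# Caveat: rounding bins values, so two numbers that are very close but fall
# on opposite sides of a rounding boundary are NOT treated as duplicates.
d2 <- duplicated_round(c(1.00000000004, 1.00000000006))
print(d2)   # FALSE FALSE, although the values differ by only 2e-11
```

If that boundary effect matters, a pairwise comparison against a tolerance avoids it at the cost of more work.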

Upvotes: 2
