Reputation: 43
I am searching for a (simple) function in R that removes duplicated elements, like unique() or duplicated(), but that accounts for "near equality" of numerical values the way all.equal() does:
unique( c(0, 0))
[1] 0
works fine, but
unique( c(0, cos(pi/2)) )
[1] 0.000000e+00 6.123234e-17
does not remove the second element, although a comparison with all.equal returns TRUE:
all.equal( 0, cos(pi/2) )
[1] TRUE
The same holds for duplicated():
duplicated( c(0, cos(pi/2)))
[1] FALSE FALSE
Any suggestions? Thanks!
Upvotes: 3
Views: 298
Reputation: 656
You might also consider the zapsmall function:
x <- rep(c(1,2), each=5) + rnorm(10)/(10^rep(1:5,2))
unique(x)
# [1] 1.0571484 1.0022854 1.0014347 0.9998829 0.9999985 2.1095720 1.9888208 2.0002687 1.9999723 2.0000078
unique(zapsmall(x, digits=4))
# [1] 1.0571 1.0023 1.0014 0.9999 1.0000 2.1096 1.9888 2.0003 2.0000
unique(zapsmall(x, digits=2))
# [1] 1.06 1.00 2.11 1.99 2.00
unique(zapsmall(x, digits=0))
# [1] 1 2
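A small sketch applying this to the values from the question. As far as I understand, zapsmall rounds relative to the largest absolute value in the vector, so the near-zero entry is only zapped when a larger value is present. The vectors below are my own illustration, not part of the example above.

```r
# Values from the question, plus a 1 so there is a larger magnitude to scale against
x <- c(0, cos(pi/2), 1)
zapped <- unique(zapsmall(x))
print(zapped)

# Assumption worth checking in your R version: when the vector holds only
# tiny values, the largest magnitude is itself tiny, so nothing is zapped
tiny <- unique(zapsmall(c(0, cos(pi/2))))
print(length(tiny))
```

So for data that may consist only of values near zero, rounding to a fixed number of decimals may be the safer choice.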
Upvotes: 3
Reputation: 21532
You could try this code (disclaimer: it is from my package cgwtools):
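approxeq <- function (x, y, tolerance = .Machine$double.eps^0.5, ...)
{
    # Warn if the shorter vector will be recycled in the comparison below
    if (length(x) != length(y))
        warning("x,y lengths differ. Will recycle.")
    # Element-wise test: TRUE where the absolute difference is below tolerance
    checkit <- abs(x - y) < tolerance
    # Returned invisibly: assign the result or wrap the call in print() to see it
    return(invisible(checkit))
}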
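On its own, approxeq only compares two vectors element by element. One possible way to turn it into a tolerance-aware unique() is sketched below. Note that unique_approx is a hypothetical helper of mine, not part of cgwtools, and the approxeq here is a trimmed-down copy so the snippet runs on its own.

```r
# Simplified copy of approxeq (no length warning, visible return value)
approxeq <- function(x, y, tolerance = .Machine$double.eps^0.5) {
  abs(x - y) < tolerance
}

# Hypothetical helper: keep an element only if it is not within
# `tolerance` of any element that has already been kept
unique_approx <- function(x, tolerance = .Machine$double.eps^0.5) {
  keep <- logical(length(x))            # all FALSE to start with
  for (i in seq_along(x)) {
    # x[keep] holds only earlier, already accepted elements
    keep[i] <- !any(approxeq(x[i], x[keep], tolerance))
  }
  x[keep]
}

res <- unique_approx(c(0, cos(pi/2), 1, 1 + 1e-16))
print(res)
```

The loop is quadratic in the worst case and does not handle NA, so treat it as a starting point for short vectors rather than a drop-in replacement for unique().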
Upvotes: 0
Reputation: 13056
If you'd like to take the absolute error into account rather than the relative error (as all.equal does), try:
x <- c(0, cos(pi/2), 1, 1+1e-16)
unique(x)
## [1] 0.000000e+00 6.123234e-17 1.000000e+00
(x <- x[!duplicated(round(x, 10))])
## [1] 0 1
Here elements are treated as duplicates when they agree after rounding to a fixed number of decimal places (10 above).
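If a relative criterion is wanted instead, a similar sketch can use signif(), which rounds to a number of significant digits rather than decimal places. The test vector and the choice of 10 digits are my own assumptions, chosen only to make the contrast visible.

```r
# Two tiny values that differ by a factor of 2, and two large values
# that differ only far beyond the 10th significant digit
x <- c(1e-12, 2e-12, 1e12, 1e12 + 1e-3)

# Absolute criterion: both tiny values round to 0 and are merged,
# while the two large values stay distinct
by_round <- x[!duplicated(round(x, 10))]

# Relative criterion: the tiny values stay distinct,
# while the two large values are merged
by_signif <- x[!duplicated(signif(x, 10))]

print(by_round)
print(by_signif)
```

Either way the values are binned by rounding, so two numbers that are very close but fall on opposite sides of a rounding boundary will still be kept as distinct.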
Upvotes: 2