Min

Reputation: 179

Floating point issue when using %in%

I'm having trouble using %in% on floating point numbers because of floating point precision issues, e.g.

> x = seq(0.05, 0.3, 0.01)
> x %in% seq(0.15, 0.3, 0.01)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25] FALSE  TRUE

I know this is because of how computers store floating point numbers, but is there a function like dplyr::near that could be used in place of %in%? dplyr::near(x, y) won't work if the length of x differs from the length of y.

Many thanks!

Upvotes: 3

Views: 59

Answers (3)

jay.sf

Reputation: 72974

You could convert both vectors to character with as.character before comparing.

as.character(x) %in% as.character(seq(0.15, 0.3, 0.01))
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [10] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
# [19]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

This also seems to work fine for more complicated cases. Consider:

x <- c(.2999, .3, .2499, .25)
y <- c(.299, .3, .249, .25)

as.character(x) %in% as.character(y)
# [1] FALSE  TRUE FALSE  TRUE

When rounding instead, the number of digits must be chosen correctly for the comparison to generalize,

round(x, 3) %in% round(y, 3)
# [1] TRUE TRUE TRUE TRUE
round(x, 4) %in% round(y, 4)
# [1] FALSE  TRUE FALSE  TRUE

which can be automated, at least for values that print as 0.xxx (the "- 2" accounts for the leading "0."):

d <- max(nchar(c(x, y))) - 2
round(x, d) %in% round(y, d)
# [1] FALSE  TRUE FALSE  TRUE

We could wrap both solutions into custom operators:

`%in2%` <- function(x, y) {
  d <- max(nchar(c(x, y))) - 2
  round(x, d) %in% round(y, d)
}
`%in3%` <- function(x, y) {
  as.character(x) %in% as.character(y)
}
x %in2% y
# [1] FALSE  TRUE FALSE  TRUE
x %in3% y
# [1] FALSE  TRUE FALSE  TRUE
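
One caveat worth noting: the nchar() shortcut infers the number of decimals from how the numbers print, so it only seems safe for values written like 0.xxx. For numbers that print in scientific notation it undercounts, e.g. 1e-10 %in2% 2e-10 comes out TRUE because d becomes 3 and both values round to 0. A minimal sketch that takes the digits explicitly instead (in_rounded is a hypothetical name, not an existing function):

```r
# Hypothetical helper: like %in2%, but the caller states the number of decimals
in_rounded <- function(x, y, digits) {
  # Round both sides to the same number of decimals before exact matching
  round(x, digits) %in% round(y, digits)
}

x <- c(.2999, .3, .2499, .25)
y <- c(.299, .3, .249, .25)

in_rounded(x, y, 4)
# [1] FALSE  TRUE FALSE  TRUE

# Small values that print in scientific notation stay distinguishable
in_rounded(1e-10, 2e-10, 12)
# [1] FALSE
```

The trade-off is that you have to know the precision of your data up front, but the result no longer depends on how R happens to format the numbers.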

Upvotes: 1

Ronak Shah

Reputation: 388982

You could use dplyr::near here. However, near does a pairwise comparison, and you need to compare each element of x against every value in the other vector, so wrap it in sapply with any.

check_values <- seq(0.15, 0.3, 0.01)
sapply(x, function(x) any(dplyr::near(x, check_values)))

#[1]  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
#[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#[25]  TRUE  TRUE
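
If dplyr is not available, or the sapply loop gets slow for long vectors, the same idea can be sketched in base R with outer(). This is only an illustration: near_in is a made-up name, and the default tolerance is chosen to match the one dplyr::near uses (.Machine$double.eps^0.5).

```r
# Hypothetical helper (not part of dplyr or base R): tolerance-based %in%
near_in <- function(x, table, tol = .Machine$double.eps^0.5) {
  if (length(x) == 0L) return(logical(0))
  if (length(table) == 0L) return(rep(FALSE, length(x)))
  # |x[i] - table[j]| for every pair, as a length(x) by length(table) matrix
  diffs <- abs(outer(x, table, "-"))
  # TRUE for each x[i] that lies within tol of at least one table value
  rowSums(diffs < tol) > 0
}

x <- seq(0.05, 0.3, 0.01)
res <- near_in(x, seq(0.15, 0.3, 0.01))
which(res)
# [1] 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
```

Note that outer() builds the full length(x) by length(table) matrix, so for very large inputs the sapply version may be easier on memory.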

Upvotes: 1

Tim Biegeleisen

Reputation: 521419

Using floats rounded to two decimal places seems to work:

x <- round(seq(0.05, 0.3, 0.01), 2)
x %in% round(seq(0.15, 0.3, 0.01), 2)

 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
                                                                  ^^ 0.15
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE   <-- 0.3
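
If you control how the vectors are built, another option is to sidestep the problem: either build both sequences from integers and divide once, so that equal values come out of the identical computation, or compare on an integer scale. A small sketch, assuming a step of 0.01 as in the question (the variable names are just for illustration):

```r
# Option 1: build both sequences from integers, so matching values are bit-identical
x_int <- (5:30) / 100
y_int <- (15:30) / 100
which(x_int %in% y_int)
# [1] 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

# Option 2: keep seq(), but compare whole numbers of hundredths
x <- seq(0.05, 0.3, 0.01)
y <- seq(0.15, 0.3, 0.01)
which(round(x * 100) %in% round(y * 100))
# [1] 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
```

Both variants rely on knowing the step size in advance, so they are less general than a tolerance-based comparison.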

Upvotes: 1
