owensmartin

Reputation: 393

Is there a bug in `match` in R?

How is this possible:

> match(1.68, seq(0.01,10, by = .01))
[1] 168
> match(1.67, seq(0.01,10, by = .01))
[1] NA

Does the R function match have a bug in it?

Upvotes: 4

Views: 273

Answers (2)

Josh O'Brien

Reputation: 162321

For this type of problem, I prefer the solution described by Chambers in his book 'Software for Data Analysis':

match(1.68, seq(1, 1000, by = 1)/100)
# [1] 168
match(1.67, seq(1, 1000, by = 1)/100)
# [1] 167

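(It works because producing a sequence of integers involves no floating point error. Rounding occurs only once per element, in the division by 100, and the result matches the rounding produced by converting the typed number 1.67 to binary. The elements of seq(0.01, 10, by = .01), by contrast, are built from the inexact double 0.01 and can differ from the typed number in the last bits.)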
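A minimal sketch for checking this explanation at the console; the names `grid_dec` and `grid_int` are made up for illustration. Going by the outputs quoted in this thread, the first exact comparison should come out FALSE and the second TRUE, but that can be confirmed by running it.

```r
grid_dec <- seq(0.01, 10, by = 0.01)    # built from the inexact double 0.01
grid_int <- seq(1, 1000, by = 1) / 100  # exact integers, one rounding per element

# Exact equality, which is what match() relies on for doubles
grid_dec[167] == 1.67
grid_int[167] == 1.67

# The decimal-built element is still extremely close to 1.67
abs(grid_dec[167] - 1.67) < 1e-12

# Show 17 significant digits to compare the stored values directly
sprintf("%.17g", c(grid_dec[167], grid_int[167], 1.67))
```

The `sprintf` line is only there to make the last-bit difference visible; the digits it prints are the stored doubles, not the numbers as typed.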

This solution has the virtue of not finding a match for a number like 1.6744, which is clearly not in the sequence 0.01, 0.02, 0.03, ..., 9.98, 9.99, 10.00:

match(1.6744, seq(1,1000, by = 1)/100)
# [1] NA                               ## Just as I'd like it!

Upvotes: 6

IRTFM

Reputation: 263352

This is the typical R FAQ 7.31 problem, not a bug. To avoid this common user error, use the function findInterval instead and fuzz the boundaries down a bit (or do your selections on integer sequences).

> findInterval(1.69, seq(0.01,10, by = .01))
[1] 169
> findInterval(1.69, seq(0.01,10, by = .01)-.0001)
[1] 169
> findInterval(1.68, seq(0.01,10, by = .01)-.0001)
[1] 168
> findInterval(1.67, seq(0.01,10, by = .01)-.0001)
[1] 167
> findInterval(1.66, seq(0.01,10, by = .01)-.0001)
[1] 166
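The other suggestion, doing the selection on integer sequences, could be sketched roughly as follows. The helper `match_scaled` and its arguments `n`, `scale` and `tol` are hypothetical names, not anything from base R, and the tolerance is an arbitrary assumption. Unlike the fuzzed findInterval call, which would also place an off-grid value such as 1.6744 in interval 167, this version returns NA for values that are not on the grid.

```r
# Hypothetical helper: look up x in the grid 1/scale, 2/scale, ..., n/scale
# by comparing integer keys instead of doubles.
match_scaled <- function(x, n = 1000, scale = 100, tol = 1e-8) {
  key <- round(x * scale)                  # nearest integer key
  key[abs(x * scale - key) > tol] <- NA    # reject values that are off the grid
  match(key, seq_len(n))                   # matching on integers is exact
}

match_scaled(c(1.66, 1.67, 1.68, 1.6744))
# [1] 166 167 168  NA
```

The rounding step absorbs the tiny representation error in `x * scale`, so the comparison itself never depends on the last bits of a double.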

Upvotes: 7
