ALollz

Reputation: 59549

Issue with quantile type 2

I don't understand the following behavior of quantile. With type=2 it should average at discontinuities, but this doesn't always seem to happen. If I create a vector of the numbers 1 to 100 and look at the percentiles, shouldn't I get the average at every percentile? This happens for some percentiles, but not for all (e.g. the 7th percentile).

quantile(seq(1, 100, 1), 0.05, type=2)
# 5%
# 5.5 

quantile(seq(1, 100, 1), 0.06, type=2)
# 6%
# 6.5 

quantile(seq(1, 100, 1), 0.07, type=2)
# 7%
# 8 

quantile(seq(1, 100, 1), 0.08, type=2)
# 8%
# 8.5 

Is this related to floating point issues?

100*0.06 == 6
#TRUE

100*0.07 == 7 
#FALSE

sprintf("%.20f", 100*0.07)
#"7.00000000000000088818"

Upvotes: 1

Views: 163

Answers (1)

Anders Ellern Bilgrau

Reputation: 10223

As far as I can tell, it is related to floating point: 0.07 is not exactly representable as a floating-point number, so 100*0.07 comes out slightly above 7 and is not treated as lying on a discontinuity.
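To see which whole percentiles are affected on a given machine, one option is to compare 100 * p against the intended integer directly. This is only a rough check; the names k, p, np and off are made up for illustration, and the exact set it reports is not something I would assume in advance.

```r
# Rough check: for which whole percentiles k does 100 * (k / 100)
# fail to equal k exactly in double precision?
k  <- 1:99
p  <- k / 100        # same doubles as the literals 0.01, 0.02, ...
np <- 100 * p        # what a type 2 quantile would compare to an integer

off <- k[np != k]    # percentiles where the product is not exactly k
off

# Signed error for the affected percentiles, printed with many digits
sprintf("%d: %.20f", off, np[np != k])
```

Any k listed in off is a percentile where the averaging step can be skipped, for the same reason as the 7th percentile in the question.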

p <- seq(0, 0.1, by = 0.001)
q <- quantile(seq(1, 100, 1), p, type=2)
plot(p, q, type = "b")
abline(v = 0.07, col = "grey")


If you think of the type 2 quantile as a function of p, you never evaluate the function at exactly 0.07, hence your result. Try e.g. decreasing the by argument in the code above. In that sense, the function returns exactly what is expected. In practice, with continuous data, I cannot imagine it being of any consequence (though I know that is a poor argument).
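To make the mechanism concrete, here is a minimal hand-written sketch of a type 2 quantile. It is not R's internal implementation of quantile; quantile_type2 and its tol argument are hypothetical names introduced for illustration. With tol = 0 it reproduces the jump seen in the question, and with a small tolerance it treats 100 * 0.07 as the integer 7 and averages.

```r
# Hypothetical helper (not part of base R): type 2 quantile, i.e. the
# inverse empirical CDF with averaging when n * p is an integer.
# tol controls how close n * p must be to an integer to count as one.
quantile_type2 <- function(x, p, tol = 0) {
  x  <- sort(x)
  n  <- length(x)
  np <- n * p
  j  <- floor(np + tol)              # integer part of n * p (within tol)
  on_jump <- abs(np - j) <= tol      # TRUE if n * p is treated as an integer
  lower <- x[pmax(j, 1)]             # clamp indices to the valid range 1..n
  upper <- x[pmin(j + 1, n)]
  ifelse(on_jump, (lower + upper) / 2, upper)
}

x <- seq(1, 100, 1)
quantile_type2(x, 0.06)              # 100 * 0.06 is exactly 6: averages x[6] and x[7]
quantile_type2(x, 0.07)              # 100 * 0.07 is just above 7: returns x[8]
quantile_type2(x, 0.07, tol = 1e-9)  # with a tolerance: averages x[7] and x[8]
```

The choice of tol = 1e-9 is arbitrary; any tolerance larger than the rounding error in n * p and much smaller than 1 would behave the same way for this data.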

Upvotes: 2
