Reputation: 3396
How can I find the index of an element when the element is determined using quantile()?
The match() and which() solutions from this similar question do not work (they return NA), and I think they don't work because of rounding issues.
If the quantile result is averaged/interpolated between two elements, can I specify whether it takes the lower or the higher index? My data x will always be sorted.
Example dataset (obviously the 0 and 1 quantiles here are just the min and max; they are only shown as a sanity check):
x <- c(0.000000e+00,9.771228e-09,5.864592e-06,3.474925e-04,9.083242e-04,2.458036e-02)
quantile(x, probs = c(0, 0.5, 1))
0% 50% 100%
0.0000000000 0.0001766785 0.0245803600
How do I find the indices for these quantiles? Here, the indices are 1, ??, 6.
And I guess the median is the average of the values at two indices, so can I specify whether it returns the first or the second index?
Upvotes: 2
Views: 559
Reputation: 73437
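You probably want type=4, which uses linear interpolation of the empirical CDF. For an even-length vector, the 50% quantile then falls exactly on element n/2 of the sorted data instead of being averaged between two elements, so match() can find it.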
x <- c(0.000000e+00,9.771228e-09,5.864592e-06,3.474925e-04,9.083242e-04,2.458036e-02)
(q <- quantile(x, probs=c(0, 0.5, 1), type=4))
# 0% 50% 100%
# 0.000000e+00 5.864592e-06 2.458036e-02
match(q, x)
# [1] 1 3 6
x[match(q, x)]
# [1] 0.000000e+00 5.864592e-06 2.458036e-02
Another example, this time with unsorted data (match() returns positions in the original vector):
set.seed(42)
x <- runif(1e3)
(q <- quantile(x, probs=c(0, 0.5, 1), type=4))
# 0% 50% 100%
# 0.0002388966 0.4803101290 0.9984908344
match(q, x)
# [1] 92 174 917
x[match(q, x)]
# [1] 0.0002388966 0.4803101290 0.9984908344
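As a hedged sketch rather than part of the original answer: type=4 only lands on an observed element when n * p is an integer, and interpolates otherwise. The discontinuous types of quantile(), such as type=1 (the inverse of the empirical CDF), always return an observed value, so match() should work for any probability. The 0.25 probability below is added purely for illustration.

```r
x <- c(0.000000e+00, 9.771228e-09, 5.864592e-06, 3.474925e-04,
       9.083242e-04, 2.458036e-02)
probs <- c(0, 0.25, 0.5, 1)  # 0.25 is an extra, illustrative probability

# type = 1 never interpolates, so every quantile is an element of x
q1 <- quantile(x, probs = probs, type = 1)
idx <- match(q1, x)
idx
# [1] 1 2 3 6

# type = 4 interpolates when n * p is not an integer (here 6 * 0.25 = 1.5),
# so the result is not an element of x and match() returns NA
na4 <- match(quantile(x, probs = 0.25, type = 4), x)
na4
# [1] NA
```

With type=1 an exact tie (n * p an integer) resolves to the lower element, which is why the 50% quantile maps to index 3 here.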
Upvotes: 1
Reputation: 389175
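Use findInterval? For sorted x it returns the index of the last element that is less than or equal to each quantile, i.e. the lower index when the quantile is interpolated.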
x <- c(0.000000e+00, 9.771228e-09, 5.864592e-06, 3.474925e-04,
       9.083242e-04, 2.458036e-02)
findInterval(quantile(x, probs = c(0, 0.5, 1)), x)
#[1] 1 3 6
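To choose between the lower and the higher index, one possible extension (a sketch, assuming x is sorted in ascending order; the names lower and upper are illustrative) is to step one position up only where the quantile lies strictly above the element at the lower index:

```r
x <- c(0.000000e+00, 9.771228e-09, 5.864592e-06, 3.474925e-04,
       9.083242e-04, 2.458036e-02)
q <- unname(quantile(x, probs = c(0, 0.5, 1)))

# Lower index: last position i with x[i] <= q
lower <- findInterval(q, x)

# Upper index: move up by one only when q is strictly between two elements,
# and never past the end of the vector
upper <- pmin(lower + (x[lower] < q), length(x))

lower
# [1] 1 3 6
upper
# [1] 1 4 6
```

Where a quantile coincides with an element (the 0% and 100% cases here), both indices agree.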
Upvotes: 1