Reputation: 151
As we know, the quantile function is the inverse of the cumulative distribution function.
Given an existing distribution (a vector), how can I make the results of the cumulative distribution function and the quantile function match exactly?
Here is an example given in MATLAB.
a = [150 154 151 153 124]
[x_count, x_val] = hist(a, unique(a));
% compute the cumulative probability distribution from the bin counts
p = cumsum(x_count)/sum(x_count);
x_out = quantile(a, p)
For the cumulative distribution function, the correspondence between x values and cumulative probabilities should be:
x = 124 150 151 153 154
p = 0.2000 0.4000 0.6000 0.8000 1.0000
But when I use p and quantile to compute x_out, the result differs from x:
x_out =
137.0000 150.5000 152.0000 153.5000 154.0000
Reference
Upvotes: 0
Views: 466
Reputation: 45752
From the docs:
For a data vector of five elements such as {6, 3, 2, 10, 1}, the sorted elements {1, 2, 3, 6, 10} respectively correspond to the 0.1, 0.3, 0.5, 0.7, 0.9 quantiles.
So if you want to get back exactly the numbers you put in for x, and x has 5 elements, then p needs to be p = [0.1, 0.3, 0.5, 0.7, 0.9]. The complete algorithm is explicitly defined in the documentation.
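As a quick check of the rule quoted from the docs, here is a minimal sketch that builds p as (k - 0.5)/n for k = 1..n and feeds it back into quantile. The variable names are only illustrative, and the comparison uses a small tolerance because I have not verified that the round trip is bit-exact.

```matlab
a = [150 154 151 153 124];
n = numel(a);

% Probabilities that the documentation assigns to the sorted data points:
% (k - 0.5)/n for k = 1..n, i.e. 0.1 0.3 0.5 0.7 0.9 when n = 5.
p = ((1:n) - 0.5) / n;

x_out = quantile(a, p);

% Compare against the sorted input, allowing for floating-point rounding.
matches = max(abs(x_out - sort(a))) < 1e-9;
disp(matches)
```

With these probabilities, x_out should come back as the sorted data rather than as interpolated values between neighbouring points.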
You have assumed that to get x back, p should have been [0.2, 0.4, 0.6, 0.8, 1]. But then why not p = [0, 0.2, 0.4, 0.6, 0.8]? MATLAB's algorithm seems to just take the average (midpoint) of the two conventions.
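The arithmetic behind that observation can be sketched in a few lines. The names p_upper, p_lower and p_mid are made up for this illustration; they are not anything quantile exposes.

```matlab
n = 5;                               % number of data points, as in the question

p_upper = (1:n) / n;                 % 0.2 0.4 0.6 0.8 1.0  (k/n)
p_lower = (0:n-1) / n;               % 0.0 0.2 0.4 0.6 0.8  ((k-1)/n)

% Midpoint of the two conventions: (k - 0.5)/n
p_mid = (p_upper + p_lower) / 2;     % 0.1 0.3 0.5 0.7 0.9
disp(p_mid)
```

The midpoints are the same 0.1, 0.3, 0.5, 0.7, 0.9 that the documentation assigns to five sorted data points.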
Note that R defines nine different algorithms for quantiles, so your assumptions need to be stated clearly.
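If the convention you actually want is the step-function inverse of the empirical CDF (the smallest data value whose cumulative probability is at least p, with no interpolation), you can compute that directly instead of calling quantile. This is only a sketch under that assumption; x_sorted, idx and x_step are illustrative names, and the 1e-12 tolerance is an arbitrary guard against rounding in p * n.

```matlab
a = [150 154 151 153 124];
x_sorted = sort(a);
n = numel(x_sorted);

p = (1:n) / n;                       % 0.2 0.4 0.6 0.8 1.0, as in the question

% Smallest index k with k/n >= p; the tolerance keeps values such as
% 0.6 * 5 from being pushed to the next index by floating-point error.
idx = max(1, ceil(p * n - 1e-12));

x_step = x_sorted(idx);              % 124 150 151 153 154
disp(x_step)
```

Under this definition p = 0.2 maps to 124 and p = 1 maps to 154, which is the pairing the question expected.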
Upvotes: 1