How to exactly match the result of cumulative distribution function and quantile function?

Question

As we know, the quantile function is the inverse of the cumulative distribution function.

Then, for an existing distribution (a vector of samples), how can I make the quantile function exactly reproduce the values given by the cumulative distribution function?

Here is an example given in MATLAB.

a = [150   154   151   153   124]
[x_count, x_val] = hist(a, unique(a));
% compute the cumulative probability distribution
p = cumsum(x_count)/sum(x_count);
x_out = quantile(a, p)

In the cumulative distribution function, the correspondence between cumulative probability and x value should be:

x = 124   150   151   153   154
p = 0.2000    0.4000    0.6000    0.8000    1.0000

But when I use p and quantile to compute x_out, the result differs from x:

x_out =

  137.0000  150.5000  152.0000  153.5000  154.0000

Reference

Dan · Accepted Answer

From the docs:

For a data vector of five elements such as {6, 3, 2, 10, 1}, the sorted elements {1, 2, 3, 6, 10} respectively correspond to the 0.1, 0.3, 0.5, 0.7, 0.9 quantiles.

So if you want to get back exactly the numbers you put in for x, and x has 5 elements, then p needs to be p = [0.1, 0.3, 0.5, 0.7, 0.9]. The complete algorithm is defined explicitly in the documentation.
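As a minimal sketch of the point above, using the vector from the question: the midpoint probabilities (k - 0.5)/n are passed to quantile and the output is compared with the sorted data. The names p_mid, x_sorted and max_err are only illustrative, and the comparison uses a small tolerance rather than assuming bit-exact equality.

```matlab
a = [150 154 151 153 124];
n = numel(a);

% Midpoint probabilities (k - 0.5)/n, i.e. 0.1 0.3 0.5 0.7 0.9 for n = 5
p_mid = ((1:n) - 0.5) / n;

% quantile interpolates between sorted samples; at these probabilities
% it should land on the samples themselves
x_out = quantile(a, p_mid);

% Compare with the sorted data, allowing for floating-point round-off
x_sorted = sort(a);
max_err = max(abs(x_out - x_sorted));
```

If max_err is at round-off level, x_out reproduces the sorted input.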

You have assumed that to get x back, p should have been [0.2, 0.4, 0.6, 0.8, 1]. But then why not p = [0, 0.2, 0.4, 0.6, 0.8]? MATLAB's algorithm seems to just take the average of the two conventions, which gives [0.1, 0.3, 0.5, 0.7, 0.9].
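The arithmetic behind that observation can be checked directly. This is only an illustration of the averaging argument, not a description of how quantile is implemented internally; p_upper, p_lower and p_avg are names made up for the example.

```matlab
n = 5;

% Convention assumed in the question: p = k/n
p_upper = (1:n) / n;        % 0.2 0.4 0.6 0.8 1.0

% The equally plausible alternative: p = (k - 1)/n
p_lower = (0:n-1) / n;      % 0   0.2 0.4 0.6 0.8

% Their average is (k - 0.5)/n, the probabilities quoted from the docs
p_avg = (p_upper + p_lower) / 2;
```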

Note that R defines nine different algorithms for quantiles, so your assumptions need to be stated clearly.
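One of those definitions (type 1 in R) is the plain inverse of the empirical CDF step function, which is the behaviour the question expects. A rough sketch of that inversion, done by table lookup instead of calling quantile, could look like the following. The tolerance tol and the names x_back, idx and j are assumptions made for this example, and accumarray is used here in place of hist to count occurrences.

```matlab
a = [150 154 151 153 124];

% Unique values (sorted ascending) and how often each occurs
[x_val, ~, idx] = unique(a);
x_count = accumarray(idx(:), 1).';

% Empirical CDF evaluated at each unique value
p = cumsum(x_count) / sum(x_count);

% Invert the step function: for each probability, take the smallest
% x whose cumulative probability reaches it
tol = 1e-12;                      % guards against round-off in p
x_back = zeros(size(p));
for k = 1:numel(p)
    j = find(p >= p(k) - tol, 1, 'first');
    x_back(k) = x_val(j);
end
```

With this convention x_back equals x_val, so the p = [0.2, 0.4, 0.6, 0.8, 1] from the question maps back to 124, 150, 151, 153, 154.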
