Reputation: 4519
Let's say you have some vector z and you compute [f, x] = ecdf(z), so your empirical CDF can be plotted with stairs(x, f).
Is there a simple way to compute the percentile scores of all the entries of z?
I could do something like this: loop over z, that is, for each entry z(i) of z, search x to find where z(i) is (find the index j such that x(j) = z(i)); the percentile score of z(i) is then f(j).
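The lookup described above can be sketched without an explicit loop. This is only a minimal illustration, assuming z contains no NaN values and that ecdf (Statistics and Machine Learning Toolbox) is available; the names j and scores are made up for the example. Since ecdf repeats the smallest value at the start of x (with f equal to 0 there), the first entry is skipped when matching.

```matlab
z = randn(1, 1000);            % example data
[f, x] = ecdf(z);              % x(1) duplicates min(z), with f(1) = 0

% For each z(i), find the index j such that x(j + 1) == z(i).
% The match is exact because x holds the sorted unique values of z.
[~, j] = ismember(z, x(2:end));

% Percentile score of each entry: fraction of samples <= z(i).
scores = f(j + 1);             % column vector, same orientation as f
```

Here scores(i) is the value of the empirical CDF at z(i), so the largest entry of z gets a score of 1.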
It feels like there should be a simpler, already implemented way to do this...
Upvotes: 1
Views: 280
Reputation: 112689
Let f be a monotone function defined at values x, for which you want to compute the inverse function at values p. In your case f is monotone because it is a CDF, and the values p define the desired quantiles. Then you can simply use interp1 to interpolate x, considered as a function of f, at values p:
z = randn(1,1e5); % example data: normalized Gaussian distribution
[f, x] = ecdf(z); % compute empirical CDF
p = [0.5 0.9 0.95]; % desired values for quantiles
result = interp1(f, x, p);
In an example run of the above code, this produces
result =
0.001706069265714 1.285514249607186 1.647546848952448
For the specific case of computing quantiles p from data z, you can directly use quantile and thus avoid computing the empirical CDF:
result = quantile(z, p)
The results may be slightly different depending on how the empirical CDF has been computed in the first method:
>> quantile(z, p)
ans =
0.001706803588857 1.285515826972878 1.647582486507752
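One likely source of the small difference is the probability assigned to each sorted sample. As far as I understand the documented behavior, quantile treats the i-th sorted value as the (i - 0.5)/n quantile and interpolates linearly, whereas the empirical CDF reaches i/n at that value. The sketch below reproduces both conventions by hand; it assumes the data have no ties, and the variable names are illustrative only.

```matlab
z = randn(1, 1e5);                 % example data
p = [0.5 0.9 0.95];                % desired quantile levels
n = numel(z);
zs = sort(z);

% Convention assumed for quantile: i-th sorted value <-> (i - 0.5)/n
like_quantile = interp1(((1:n) - 0.5) / n, zs, p);

% Convention of the empirical CDF: i-th sorted value <-> i/n,
% with the minimum repeated at probability 0 (as ecdf returns it)
like_ecdf = interp1([0, (1:n) / n], [zs(1), zs], p);

shift = like_quantile - like_ecdf;  % effect of the half-sample offset
```

With n = 1e5 the two conventions differ by half a sample in probability, which is consistent with the outputs agreeing only to a few decimal places.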
For comparison, the theoretical values for the above example (Gaussian distribution) are
>> norminv(p)
ans =
0 1.281551565544601 1.644853626951472
Upvotes: 3