Gohann

Reputation: 155

How to get the cumulative distribution function of a vector in Matlab using cumsum?

I want to get the probability of observing a value X greater than or equal to x_i, i.e. P(X >= x_i), as a cumulative distribution function (CDF). I've tried to do it in Matlab with this code.

Let's assume the data is in the column vector p1.

   xp1 = linspace(min(p1), max(p1));   %range of bins
   histp1 = histc(p1(:), xp1);      %histogram of data
   probp1 = histp1/sum(histp1);     %normalized histogram (fraction of samples per bin)
   figure;plot(probp1, 'o')

Now I want to calculate the CDF:

   sorncount = flipud(histp1);  
   cumsump1 = cumsum(sorncount);  
   normcumsump1 = cumsump1/max(cumsump1);  
   cdf = flipud(normcumsump1);  
   figure;plot(xp1, cdf, 'ok');  

Can anyone tell me whether this is correct, or am I doing something wrong?

Upvotes: 2

Views: 2409

Answers (1)

user3717023

Your code works correctly, but is a bit more complicated than it could be. Since probp1 has been normalized to sum to 1, the cumulative sum of probp1 reaches a maximum of 1 (up to floating-point rounding), so if you accumulate probp1 rather than histp1 there is no need to divide by the maximum. This shortens the code a bit:

xp1 = linspace(min(p1), max(p1));   %range of bins
histp1 = histc(p1(:), xp1);         %count for each bin
probp1 = histp1/sum(histp1);        %normalized histogram (fraction of samples per bin)
cdf = flipud(cumsum(flipud(probp1)));   %CDF (unconventional, of P(X>=a) kind)

As Raab70 noted, most of the time the CDF is understood as P(X<=a), in which case you don't need flipud: taking cumsum(probp1) is all that's needed.
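
To compare the two conventions side by side, here is a minimal sketch on made-up data. The sample vector p1, the choice of 5 bin edges, and the names cdf_le and cdf_ge are assumptions for illustration only and are not part of the question.

```matlab
% Illustrative data (an assumption); any numeric vector would do.
p1 = [1; 2; 2; 3; 3; 3; 4; 4; 5; 5];

xp1    = linspace(min(p1), max(p1), 5);  % 5 bin edges: 1 2 3 4 5
histp1 = histc(p1(:), xp1);              % counts per bin: 1 2 3 2 2
probp1 = histp1 / sum(histp1);           % fraction of samples per bin

cdf_le = cumsum(probp1);                   % P(X <= a), the conventional CDF
cdf_ge = flipud(cumsum(flipud(probp1)));   % P(X >= a), accumulated from the top

% Columns: bin edge, P(X <= a), P(X >= a)
disp([xp1(:), cdf_le, cdf_ge]);
```

Both curves come from the same normalized counts, so cdf_le ends at 1 and cdf_ge starts at 1. Note that recent MATLAB releases flag histc as not recommended in favor of histcounts, which handles the last edge differently, so the counts may not match exactly if you switch.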

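Also, I would probably use histp1(end:-1:1) instead of flipud(histp1). flipud only reverses the order of rows, so it leaves a row vector unchanged, whereas the indexing form flips the vector regardless of whether it is a row or a column.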
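
A quick way to check the row/column behaviour is the following sketch; the variable names r, c and the result names are hypothetical and chosen only for this illustration.

```matlab
r = [1 2 3];      % row vector
c = [1; 2; 3];    % column vector

r_flipud = flipud(r);     % flipud reverses rows; a row vector has only one row
c_flipud = flipud(c);     % column vector is reversed

r_index = r(end:-1:1);    % reversed, still a row
c_index = c(end:-1:1);    % reversed, still a column

disp(isequal(r_flipud, r));          % flipud left the row vector as it was
disp(isequal(r_index, [3 2 1]));     % indexing reversed the row vector
disp(isequal(c_index, c_flipud));    % both approaches agree for a column
```

In the question the data is forced into a column with p1(:), so flipud happens to work there; the indexing form simply removes that dependency.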

Upvotes: 1
