Reputation: 1335
I need some help with calculating a cumulative distribution.
Let's say I have data like this:
data = abs(randn(1000,1));
I have to calculate the cumulative probability distribution and bin it to reduce the number of points. I am doing it like this (let's take 50 bins):
[n, x] = hist(data, 50);
y = cumsum(n);
y = y./max(y);
The problem is that I now have a lot of points close to y = 1, but only a few close to zero. I'd like the points to be distributed roughly evenly (additional binning on the y axis?). I hope you know what I mean :) How can I do that? Thanks!
Upvotes: 1
Views: 1134
Reputation: 19870
So, it actually means that many points in your data vector are close to 0. The usual procedure is to transform the data with a logarithm, log2 or log10, depending on the nature of the data.
Try
[n, x] = hist(log10(data), 50);
y = cumsum(n);
y = y./max(y);
You can also try sqrt instead of log, or other functions.
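Note that after the transform, the bin centers x are in log10 units rather than in the units of data. A minimal sketch of mapping them back to the original scale, assuming data may contain exact zeros that have to be dropped first (log10(0) is -Inf); the name x_orig is only illustrative:

```matlab
data = abs(randn(1000,1));           % sample data, as in the question
data = data(data > 0);               % guard: log10(0) would give -Inf
[n, x] = hist(log10(data), 50);      % x holds the bin centers in log10 units
y = cumsum(n);
y = y./max(y);                       % normalized cumulative distribution
x_orig = 10.^x;                      % bin centers back on the original scale
```

Plotting with semilogx(x_orig, y, '.') then shows the points evenly spaced along a logarithmic x axis.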
UPDATE
Reviewing the question after your comment, I think you want to use something like this:
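bin = 10.^(linspace(log10(min(data)),log10(max(data)),50));  % 50 log-spaced bin centers
[n, x] = hist(data, bin);    % a vector second argument is taken as bin centers
y = cumsum(n);               % cumulative counts
y = y./max(y);               % normalize so the last value is 1
plot(bin,y,'.')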
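If the goal is points spread evenly along the y axis rather than the x axis, a different technique may fit better: instead of binning with hist, sample the empirical CDF at equally spaced probability levels. This is only a sketch of that alternative, not the approach above; the names nPts, xs, idx, p and xq are illustrative.

```matlab
data = abs(randn(1000,1));           % sample data, as in the question
nPts = 50;                           % number of points to keep (assumed)
xs = sort(data);                     % sorted sample: x values of the empirical CDF
N = numel(xs);
idx = ceil((1:nPts)' * N / nPts);    % indices at equally spaced ranks, ending at N
p = idx / N;                         % empirical CDF value at each kept point
xq = xs(idx);                        % data value at which the CDF reaches p
```

plot(xq, p, '.') then gives nPts points that are evenly spaced in y, at the cost of uneven spacing in x.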
Upvotes: 2