drjrm3

Reputation: 4718

Is there a simple histogram function?

For data diffs07 and diffs14, in Matlab I can get the x and y coordinates of the binned data by simply using:

[ys07, xs07] = hist(-log10(diffs07), 250);
[ys14, xs14] = hist(-log10(diffs14), 250);

In Python, however, I can't find a straightforward way, so I am using:

xs_diffs = np.linspace(0, 17, 250)
dx = xs_diffs[1] - xs_diffs[0]  # bin width (positive)
ys07 = []
ys14 = []
for x in xs_diffs:
    # count the values that fall in the bin centered on x
    ys07.append(len([ty for ty in diffs07 if x - dx/2.0 <= -np.log10(ty) < x + dx/2.0]))
    ys14.append(len([ty for ty in diffs14 if x - dx/2.0 <= -np.log10(ty) < x + dx/2.0]))

plt.plot(xs_diffs, ys07, 'r.', xs_diffs, ys14, 'b.')
plt.show()

But this takes quite a long time compared to the Matlab code. Is there a straightforward (and faster!) way to do this in Python?

Upvotes: 0

Views: 570

Answers (1)

David Wilkinson

Reputation: 151

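hist, bin_edges = np.histogram(-np.log10(diffs07), bins=250)

for automatic selection of bins (the equivalent of your Matlab statement). Note that np.histogram returns the 251 bin edges, whereas Matlab's hist returns the 250 bin centers. Alternatively,

hist, bin_edges = np.histogram(-np.log10(diffs07), bins=np.linspace(0, 17, 250))

should be used if you want to specify the range of the bins used as per your initial attempt. Here the array passed as bins is the list of edges, so 250 points give 249 bins; use np.linspace(0, 17, 251) if you want exactly 250.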
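To get the same (ys, xs) pair that Matlab's hist returns, the edges can be turned into centers by averaging neighbours. Below is a minimal sketch of that; the random data is only a stand-in I made up for diffs07 (assumed to be positive so the log is defined), and names like xs07_fixed are just for illustration.

```python
import numpy as np

# Synthetic stand-in for diffs07 (assumption: strictly positive values)
rng = np.random.default_rng(0)
diffs07 = 10.0 ** (-rng.uniform(0.0, 17.0, size=10_000))

values = -np.log10(diffs07)

# 250 equal-width bins placed automatically between min(values) and max(values)
ys07, edges = np.histogram(values, bins=250)

# np.histogram returns 251 edges; average neighbours to get 250 bin centers
xs07 = 0.5 * (edges[:-1] + edges[1:])

# Fixed bins on [0, 17]: 251 edges give 250 bins
fixed_edges = np.linspace(0, 17, 251)
ys07_fixed, _ = np.histogram(values, bins=fixed_edges)
xs07_fixed = 0.5 * (fixed_edges[:-1] + fixed_edges[1:])
```

xs07 and ys07 can then go straight into plt.plot(xs07, ys07, 'r.') as in the question. Since the counting happens inside NumPy rather than in a Python loop over every bin, this should be much faster than the list-comprehension version.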

Matplotlib also has the very helpful plt.hist for immediate plotting, again with the option of automatic or set bins.
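For the plotting route, something along these lines should work. It is only a sketch: the data is again a made-up stand-in for diffs07 and diffs14, the step style and labels are my own choices, and the Agg backend line is there just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for diffs07 and diffs14 (assumption: strictly positive values)
rng = np.random.default_rng(1)
diffs07 = 10.0 ** (-rng.uniform(0.0, 17.0, size=5_000))
diffs14 = 10.0 ** (-rng.uniform(0.0, 17.0, size=5_000))

# Shared edges so both histograms use identical bins: 251 edges -> 250 bins
edges = np.linspace(0, 17, 251)

fig, ax = plt.subplots()
# ax.hist bins the data and draws it; it returns (counts, edges, patches)
counts07, _, _ = ax.hist(-np.log10(diffs07), bins=edges,
                         histtype="step", color="r", label="diffs07")
counts14, _, _ = ax.hist(-np.log10(diffs14), bins=edges,
                         histtype="step", color="b", label="diffs14")
ax.set_xlabel("-log10(diff)")
ax.set_ylabel("count")
ax.legend()
```

In an interactive session, finish with plt.show(). Passing bins=250 instead of the edges array lets Matplotlib pick the range from each data set, but then the two histograms will generally not share the same bins.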

Upvotes: 5
