duckvader

Reputation: 71

Unequal width binned histogram in python

I have an array of probability values, some of which are 0. I need to plot a histogram in which every bin contains an equal number of elements. I tried matplotlib's hist function, but it only lets me choose the number of bins. How do I go about plotting this? (A normal plot and hist both work, but they are not what I need.)

I have 10000 entries. Only 200 have values greater than 0, and those lie between 0.0005 and 0.2. The distribution isn't even: only one element has the value 0.2, whereas approximately 2000 have the value 0.0005. Plotting was therefore an issue, because the bins had to be of unequal width while holding an equal number of elements.

Upvotes: 0

Views: 2309

Answers (1)

sascha

Reputation: 33522

The task does not make much sense to me, but the following code does what I understood the task to be.

I also think the last lines of the code are what you really wanted to do: using different bin widths to improve the visualization, without targeting an equal number of samples in each bin. I used astroML's hist with bins='blocks' (astropy supports this too).

Code

# Python 3 -> beware the // operator!

import numpy as np
import matplotlib.pyplot as plt
from astroML import plotting as amlp

N_VALUES = 1000
N_BINS = 100

# Create fake data
prob_array = np.random.randn(N_VALUES)
prob_array /= np.max(np.abs(prob_array),axis=0)  # scale a bit

# Sort array
prob_array = np.sort(prob_array)

# Calculate bin borders: every (N_VALUES // N_BINS)-th sorted value starts a new bin
step = N_VALUES // N_BINS
bin_borders = ([np.amin(prob_array)]
               + [prob_array[step * i] for i in range(1, N_BINS)]
               + [np.amax(prob_array)])

print('SAMPLES: ', prob_array)
print('BIN-BORDERS: ', bin_borders)

# Plot hist
counts, x, y = plt.hist(prob_array, bins=bin_borders)
plt.xlim(bin_borders[0], bin_borders[-1] + 1e-2)
print('COUNTS: ', counts)
plt.show()


# And this is what I think you really want

fig, (ax1, ax2) = plt.subplots(2)
left_blob = np.random.randn(N_VALUES // 10) + 3  # integer division: randn needs an int size
right_blob = np.random.randn(N_VALUES) + 110
both = np.hstack((left_blob, right_blob))  # data is hard to visualize with equal bin-widths

ax1.hist(both)
amlp.hist(both, bins='blocks', ax=ax2)
plt.show()
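
As a side note, the equal-count bin borders from the first part can also be sketched with quantiles instead of manual indexing. This is only a minimal sketch under some assumptions: NumPy 1.15 or newer (for np.quantile), a helper name equal_count_edges that is made up here for illustration, and exponential fake data that is not the asker's real data. With many repeated values (such as the zeros in the question), several borders coincide, so the sketch drops duplicates and may return fewer bins than requested.

```python
import numpy as np


def equal_count_edges(values, n_bins):
    """Hypothetical helper: bin edges holding roughly len(values) / n_bins samples each."""
    values = np.asarray(values, dtype=float)
    # Quantiles at evenly spaced probabilities act as equal-count bin borders.
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    # Heavily repeated values (e.g. many zeros) yield duplicate borders;
    # np.unique drops them, so fewer than n_bins bins may come back.
    return np.unique(edges)


# Fake, skewed data (an assumption, not the data from the question)
rng = np.random.default_rng(0)
samples = rng.exponential(scale=0.02, size=1000)

edges = equal_count_edges(samples, 10)
counts, _ = np.histogram(samples, bins=edges)
# density=True divides each count by its bin width, so bar heights differ
# even though the raw counts are (almost) identical.
density, _ = np.histogram(samples, bins=edges, density=True)

print('EDGES: ', edges)
print('COUNTS: ', counts)
print('DENSITY: ', density)
```

The resulting edges can be passed to plt.hist(samples, bins=edges) just like bin_borders above. Since every bin holds about the same number of samples, plotting raw counts gives bars of nearly equal height, so density=True is probably the more informative choice for this kind of binning.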

Output

[Two images in the original post: the histogram plots produced by the code above.]

Upvotes: 2
