user2261062

Points outside histogram range are discarded from plot

I am plotting histograms of values and I want all of them to use the same bins, so the plots can be compared. To do so, I specify a vector x containing the bin edges.

import numpy as np
import matplotlib.pyplot as plt

data = np.array([0.1, 0.1, 0.2, 0.2, 0.2, 0.32])
x = np.linspace(0, 0.2, 9)  # 9 edges, i.e. 8 bins between 0 and 0.2
plt.hist(data, x)

What I notice is that if I specify the range of x to be between 0 and 0.2, then values larger than 0.2 (0.32 in the example) are discarded from the plot.

Is there a way of accumulating all values greater than 0.2 in the last bin and all values lower than 0.0 in the first bin?

Of course I can do something like

data[data>0.2] = 0.2
data[data<0.0] = 0.0

But I'd prefer not to modify my original array and not have to make a copy of it unless there isn't another way.

Upvotes: 0

Views: 1182

Answers (1)

James

Reputation: 36608

You can pass the bins argument as an array of edges placed wherever you want; they do not have to be evenly spaced, although uneven spacing gives bars of different widths. For your particular case, you can use the .clip method of the data array, which returns a clipped copy and leaves the original array unchanged.

import numpy as np
import matplotlib.pyplot as plt

data = np.array([0.1, 0.1, 0.2, 0.2, 0.2, 0.32])
x = np.linspace(0, 0.2, 9)
plt.hist(data.clip(min=0, max=0.2), x)  # data itself is not modified
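To check that nothing is dropped, the same idea can be tried without plotting. This is only a sketch: it uses np.histogram to compute the counts, and the names edges, clipped and counts are illustrative and not part of the original post.

```python
import numpy as np

data = np.array([0.1, 0.1, 0.2, 0.2, 0.2, 0.32])
edges = np.linspace(0, 0.2, 9)  # same bin edges as in the question

# np.clip returns a new array, so `data` keeps its original values.
clipped = np.clip(data, edges[0], edges[-1])

# Count the clipped values per bin; the last bin includes its right edge,
# so everything clipped to 0.2 is counted there.
counts, _ = np.histogram(clipped, bins=edges)

print(counts.sum(), "of", data.size, "values counted")
print("original maximum is still", data.max())
```

If a plot is wanted from these counts, something like plt.bar(edges[:-1], counts, width=np.diff(edges), align="edge") should draw the same bars as plt.hist.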

Upvotes: 2
