user248237

placing numbers into bins with numpy

I am binning an array into a set of bins using np.digitize:

import numpy as np

data = np.array([1, 5, 6, 15, 25, 60])
bins = np.array([5, 10, 20, 50])
result = np.digitize(data, bins)
# this fails with an IndexError
print(bins[result])

I want each value placed into a bin where every bin edge is interpreted as "less than or equal to", except for the last bin, which should also catch all remaining values. In this case the bins would be: "x <= 5, 5 < x <= 10, 10 < x <= 20, and 20 < x <= 50 including x > 50". Is there a function that does this? What's a concise way to do it in numpy?

Upvotes: 2

Views: 2725

Answers (1)

rtrwalker

Reputation: 1021

When you say "20 < x <= 50 including x > 50" for your last bin, you are really saying x > 20. You can get x > 20 by dropping your last bin edge of 50. np.digitize takes a parameter right which, when True, gives bin behaviour like 10 < x <= 20 rather than the default 10 <= x < 20:

>>> import numpy as np
>>> data = np.array([1, 5, 6, 15, 25, 60])
>>> bins = np.array([5, 10, 20])
>>> np.digitize(data, bins, right=True)
array([0, 0, 1, 2, 3, 3])
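
To see what right changes for values that sit exactly on an edge, here is a small side-by-side comparison using the same data and bins as above. The variable names left_closed and right_closed are only illustrative:

```python
import numpy as np

data = np.array([1, 5, 6, 15, 25, 60])
bins = np.array([5, 10, 20])

# Default (right=False): intervals are bins[i-1] <= x < bins[i],
# so the value 5 lands in interval 1.
left_closed = np.digitize(data, bins)

# right=True: intervals are bins[i-1] < x <= bins[i],
# so the value 5 lands in interval 0.
right_closed = np.digitize(data, bins, right=True)

print(left_closed)   # [0 1 1 2 3 3]
print(right_closed)  # [0 0 1 2 3 3]
```

Only the value 5 changes interval here, because it is the only data point that equals a bin edge.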

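Your code bins[result] fails because np.digitize can return an index one past the end of bins. Here bins is defined with 3 values, but there are actually 4 intervals (x<=5, 5<x<=10, 10<x<=20, 20<x). So, for example, 65 will be placed in the bin with index 3, i.e. the 4th interval. The 4th value of bins does not exist, hence your error. The same holds for your original 4-value bins: 60 gets index 4, and bins[4] does not exist.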
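
If you still want to index back into something with the result, one option is to index an array that has one entry per interval, i.e. len(bins) + 1 entries. This is a minimal sketch; the labels array and the choice of np.inf as the open upper edge are assumptions for illustration, not part of np.digitize:

```python
import numpy as np

data = np.array([1, 5, 6, 15, 25, 60])
bins = np.array([5, 10, 20])
idx = np.digitize(data, bins, right=True)

# One label per interval (illustrative names), so every index is valid.
labels = np.array(["x<=5", "5<x<=10", "10<x<=20", "x>20"])
print(labels[idx])

# Alternatively, append an open-ended upper edge to the bin edges.
upper_edges = np.append(bins, np.inf)
print(upper_edges[idx])  # [ 5.  5. 10. 20. inf inf]
```

Note that np.append returns a float array here, since np.inf is a float.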

Upvotes: 4
