max_max_mir
max_max_mir

Reputation: 1736

Binning in Numpy

I have an array A which I am trying to put into 10 bins. Here is what I've done.

A = range(1,94)
hist = np.histogram(A, bins=10)
np.digitize(A, hist[1])

But the output has 11 bins, not 10, with the last value (93) placed in bin 11, when it should have been in bin 10. I can fix it with a hack, but what's the most elegant way of doing this? How do I tell digitize that the last bin in hist[1] is inclusive on the right - [ ] instead of [ )?

Upvotes: 4

Views: 3854

Answers (1)

user6655984
user6655984

Reputation:

The output of np.histogram actually has 10 bins; the last (right-most) bin includes the greatest element because its right edge is inclusive (unlike for other bins).

The np.digitize method doesn't make such an exception (since its purpose is different) so the largest element(s) of the list get placed into an extra bin. To get the bin assignments that are consistent with histogram, just clamp the output of digitize by the number of bins, using fmin.

A = range(1,94)
bin_count = 10
hist = np.histogram(A, bins=bin_count)
np.fmin(np.digitize(A, hist[1]), bin_count)

Output:

array([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,
        2,  2,  3,  3,  3,  3,  3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  4,
        4,  4,  4,  5,  5,  5,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,
        6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,
        8,  8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9,  9,  9,  9, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10])

Upvotes: 4

Related Questions