python - sum up list/array according to how another list/array gets assigned by np.histogram

Question

I have two lists of equal length, a and b. I want to build a histogram from a and sum the values of b into a new list c according to the bin each element of a is assigned to.

a = [0.3, 1.2, 1.8, 0.5, ...]
b = [1, 1, 0, 0, ...]

hist = np.histogram(a, bins=[0.0, 0.5, 1., 1.5, 2.])

In the example above, b[0] would be added to c[0] because a[0] falls into the first bin; b[1] would be added to c[2] because a[1] falls into the third bin, and so on. What is a scalable way of doing this without loops? (Loops are too slow for large lists.)

Quang Hoang · Accepted Answer

I recommend pandas for this purpose:

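import pandas as pd
# Assign each element of a to an interval; pd.cut bins are right-closed by default, e.g. (0.0, 0.5]
buckets = pd.cut(a, bins=[0.0, 0.5, 1.0, 1.5, 2.0])

# Sum b within each bucket; observed=False keeps empty bins in the result
result = pd.Series(b).groupby(buckets, observed=False).sum()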

Output:

(0.0, 0.5]    1
(0.5, 1.0]    0
(1.0, 1.5]    1
(1.5, 2.0]    0
dtype: int64
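
For comparison, here is a NumPy-only sketch that is not part of the answer above. It uses the weights argument of np.histogram, so each element of a contributes b[i] to its bin instead of 1. The arrays hold only the four values shown in the question (the elided "..." entries are left out), and the name c follows the question's wording.

```python
import numpy as np

# Only the four values visible in the question; the "..." entries are omitted.
a = np.array([0.3, 1.2, 1.8, 0.5])
b = np.array([1, 1, 0, 0])
bins = [0.0, 0.5, 1.0, 1.5, 2.0]

# With weights=b, each element of a adds b[i] (rather than 1) to its bin,
# so c[k] is the sum of b over all elements of a that land in bin k.
c, edges = np.histogram(a, bins=bins, weights=b)

print(c)  # per-bin sums of b: 1, 0, 1, 0 for this data
```

One difference to keep in mind: np.histogram bins are left-closed, [0.0, 0.5), [0.5, 1.0) and so on, with only the last bin including its right edge, whereas pd.cut defaults to right-closed intervals such as (0.0, 0.5]. A value sitting exactly on an edge, like a[3] = 0.5, therefore lands in the second bin with np.histogram but in the first with pd.cut. The sums agree here only because b[3] is 0.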
