Commoner

Reputation: 1768

How many elements of numpy array in specified number range

I have a list of sorted numpy arrays of unequal lengths (say M0, M1, M2). For each of these arrays, I want to find how many elements lie inside number ranges defined by adjoining pairs of another array (say zbin). zbin is not sorted and always has an even number of elements; the number ranges are [zbin[0], zbin[1]], [zbin[2], zbin[3]], [zbin[4], zbin[5]] and so on. The unsorted nature of zbin and the use of adjoining pairs in zbin to define the number ranges make this question different from the one asked here: Number of elements of numpy arrays inside specific bins. In that question, zarr was sorted and adjoining elements gave the number ranges (here, adjoining pairs give the number ranges).

This is what I am doing presently:

import numpy as np

def search(numrange, lst):
    """Count the elements of each sorted array in lst inside [numrange[0], numrange[1]]."""
    arr = np.zeros(len(lst))
    for i, probe in enumerate(lst):
        count = 0
        for value in probe:
            # probe is sorted, so nothing past the upper bound can match
            if value > numrange[1]:
                break
            if value >= numrange[0]:
                count += 1
        arr[i] = count
    return arr


""" Some example of sorted one-dimensional arrays of unequal lengths """
M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])

""" Implementation and output """
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

zarr = np.zeros((len(zbin)//2, len(lst)))
for i in range(len(zbin)//2):
    indx = i*2
    # each adjoining pair of zbin defines one closed number range
    numrange = [zbin[indx], zbin[indx+1]]
    zarr[i,:] = search(numrange, lst)

print(zarr)

The output is:

[[ 1.  1.  0.]
 [ 1.  1.  0.]
 [ 1.  1.  0.]]

Here, the first row of zarr ([1,1,0]) shows that, for the number range [5.0, 5.2], M0 has 1 element, M1 has 1 element and M2 has 0 elements. The second and third rows show the results for the subsequent number ranges, i.e. [5.1, 5.3] and [5.2, 5.4].

I want to know what is the fastest way to achieve this desired functionality (zarr). In my actual task, I will be dealing with zbin of much bigger size, and many more arrays (M). I will very much appreciate any help.

Upvotes: 1

Views: 185

Answers (1)

filippo

Reputation: 5294

I'm not sure numpy will really get you any speedup, but here's an attempt:

lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

zarr = np.zeros((len(zbin)//2, len(lst)), dtype=float)

for i,M in enumerate(lst):
    zarr[:,i] = np.count_nonzero(np.logical_and(M >= zbin[::2, np.newaxis],
                                                M <= zbin[1::2, np.newaxis]), axis=1)

In [10]: zarr
Out[10]: 
array([[1., 1., 0.],
       [1., 1., 0.],
       [1., 1., 0.]])

By the way, if you can exploit the sorted nature of the arrays, @Divakar's solution from the linked question should definitely be faster.
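
For illustration, here is a minimal sketch of one way to exploit the sorting with np.searchsorted. It is not necessarily the approach from the linked answer, and count_in_ranges is a hypothetical helper name. It assumes every array in the list is sorted in ascending order and that both ends of each range are inclusive, as in the question:

```python
import numpy as np

def count_in_ranges(sorted_arrays, zbin):
    """Count elements of each sorted array inside each closed range [lo, hi].

    Hypothetical helper: assumes every array is sorted ascending and that
    zbin holds (lo, hi) pairs at positions (0, 1), (2, 3), ...
    """
    lo = zbin[0::2]  # lower bound of each range
    hi = zbin[1::2]  # upper bound of each range
    counts = np.zeros((len(lo), len(sorted_arrays)), dtype=int)
    for i, M in enumerate(sorted_arrays):
        # index of the first element >= lo
        left = np.searchsorted(M, lo, side="left")
        # index just past the last element <= hi
        right = np.searchsorted(M, hi, side="right")
        # clip at zero in case a pair has lo > hi
        counts[:, i] = np.maximum(right - left, 0)
    return counts

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

zarr = count_in_ranges([M0, M1, M2], zbin)
print(zarr)
```

Each searchsorted call is a binary search per range, so no boolean array of shape (number of ranges, len(M)) is ever built. That should help when both zbin and the arrays are large, though it is worth timing on your real data.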

Upvotes: 1
