Reputation: 1768
I have a list of sorted numpy arrays of unequal lengths (say M0, M1, M2), and I want to find how many elements of each array fall inside number ranges given by adjoining pairs of elements of another array (say zbin). zbin is not sorted, and the number ranges are [zbin[0], zbin[1]], [zbin[2], zbin[3]], [zbin[4], zbin[5]] and so on; zbin always has an even number of elements. The unsorted nature of zbin and the use of adjoining pairs in zbin to define the number ranges make this question different from the one asked here: Number of elements of numpy arrays inside specific bins. In that question, zarr was sorted and adjoining elements gave the number ranges (here, adjoining pairs give the number ranges).
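To make the pairing concrete, here is a minimal sketch of how the ranges are read off zbin. The names pairs, lows and highs are only illustrative and are not used elsewhere in this post.

```python
import numpy as np

zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

# Each row is one number range: [zbin[0], zbin[1]], [zbin[2], zbin[3]], ...
pairs = zbin.reshape(-1, 2)

# Lower and upper bounds of every range (illustrative names)
lows, highs = pairs[:, 0], pairs[:, 1]
print(pairs)
```

The same bounds can also be taken as zbin[::2] and zbin[1::2].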
This is what I am doing presently:
""" Function to do search query """
import numpy as np

def search(numrange, lst):
    # One count per array in lst
    arr = np.zeros(len(lst))
    for i in range(len(lst)):
        probe = lst[i]
        count = 0
        for j in range(len(probe)):
            # probe is sorted, so stop once we are past the upper bound
            if (probe[j]>numrange[1]): break
            if (probe[j]>=numrange[0]) and (probe[j]<=numrange[1]): count = count + 1
        arr[i] = count
    return arr
""" Some example of sorted one-dimensional arrays of unequal lengths """
M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])
""" Implementation and output """
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])
zarr = np.zeros( (len(zbin)//2, len(lst)) )
for i in range(len(zbin)//2):
    indx = i*2
    # Adjoining pair (zbin[indx], zbin[indx+1]) defines one number range
    numrange = [zbin[indx], zbin[indx+1]]
    zarr[i,:] = search(numrange, lst)
print(zarr)
The output is:
[[ 1. 1. 0.]
[ 1. 1. 0.]
[ 1. 1. 0.]]
Here, the first row of zarr ([1, 1, 0]) shows that M0 has 1 element in the considered number range [5.0, 5.2], M1 has 1 element and M2 has 0 elements. The second and third rows show the results for the subsequent number ranges, i.e. [5.1, 5.3] and [5.2, 5.4].
I want to know the fastest way to compute zarr. In my actual task, I will be dealing with a much bigger zbin and many more arrays (M). I will very much appreciate any help.
Upvotes: 1
Views: 185
Reputation: 5294
Not sure numpy would really get you any speed up, but here's an attempt:
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])
zarr = np.zeros((len(zbin)//2, len(lst)), dtype=float)
for i, M in enumerate(lst):
    # Broadcasting compares every element of M with every (low, high) pair
    zarr[:,i] = np.count_nonzero(np.logical_and(M >= zbin[::2, np.newaxis],
                                                M <= zbin[1::2, np.newaxis]), axis=1)
In [10]: zarr
Out[10]:
array([[1., 1., 0.],
[1., 1., 0.],
[1., 1., 0.]])
By the way, if you can exploit the sorted nature of the arrays, @Divakar's solution from the linked question should definitely be faster.
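For reference, one way to exploit the sorting is np.searchsorted. The sketch below is my own adaptation to the pair-based ranges, not a copy of the linked solution, and count_in_ranges is a hypothetical helper name. It assumes every array in the list is sorted in ascending order and that both ends of each range are inclusive, as in the question.

```python
import numpy as np

def count_in_ranges(zbin, arrays):
    """Count elements of each sorted array inside [zbin[2k], zbin[2k+1]]."""
    lows, highs = zbin[::2], zbin[1::2]
    counts = np.empty((len(lows), len(arrays)), dtype=int)
    for i, M in enumerate(arrays):
        # Index of the first element >= low
        first = np.searchsorted(M, lows, side="left")
        # Index just past the last element <= high
        last = np.searchsorted(M, highs, side="right")
        # A pair with high < low would give a negative difference; treat it as empty
        counts[:, i] = np.maximum(last - first, 0)
    return counts

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

result = count_in_ranges(zbin, [M0, M1, M2])
print(result)
```

Each array costs two binary searches per range instead of a full comparison against every element, so this should scale better for long arrays. Note that it returns integer counts rather than the float zarr used above.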
Upvotes: 1