Zack Eriksen

Reputation: 301

Efficient way to check which array rows fall between the values of two other vectors

I have a large NumPy array of shape (20, 702000), and I need to find the rows in which every value falls between the corresponding values of two other arrays. In the following simplified example,

a = np.array([[1,3,4,3,2,4],
              [4,6,2,5,8,6],
              [1,4,8,3,1,9],
              [9,4,2,5,6,5]])

range_low = np.array([0,2,3,2,1,3])
range_high = np.array([2,4,5,4,3,5])

a[0] is the only row that fits the criteria: each of its values falls between the corresponding values of range_low and range_high.

I have tried something like,

[np.where(np.logical_and(x > range_low, x < range_high)) for x in a]

and it works, but on such a large array the loop takes a very long time. Is there a quicker way to do something like this? Thanks!

Upvotes: 0

Views: 41

Answers (2)

Skarfie123

Reputation: 3

Quicker:

[np.where(x) for x in np.logical_and(a > range_low, a < range_high)]

Even quicker, but with a different output format:

np.where(np.logical_and(a > range_low, a < range_high))
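To make the different output format concrete, here is a small sketch using the example data from the question. On a 2-D boolean mask, np.where returns a tuple of two index arrays (row indices and column indices), one entry per element that is in range, instead of one result per row. The names mask, rows and cols are only illustrative.

```python
import numpy as np

a = np.array([[1, 3, 4, 3, 2, 4],
              [4, 6, 2, 5, 8, 6],
              [1, 4, 8, 3, 1, 9],
              [9, 4, 2, 5, 6, 5]])
range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

# Broadcasting compares every row of `a` against the 1-D bounds at once.
mask = np.logical_and(a > range_low, a < range_high)

# On a 2-D mask, np.where returns (row_indices, column_indices).
rows, cols = np.where(mask)
print(rows)  # [0 0 0 0 0 0 2 2]
print(cols)  # [0 1 2 3 4 5 0 3]
```

Note that row 2 shows up here even though only two of its elements are in range, so this output lists matching elements, not rows that match in every column.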

Times:

import timeit

print(timeit.timeit(lambda: [np.where(np.logical_and(x > range_low, x < range_high)) for x in a], number=100000)) # 1.3141384999999999
print(timeit.timeit(lambda: [np.where(x) for x in np.logical_and(a > range_low, a < range_high)], number=100000)) # 1.0557419000000001
print(timeit.timeit(lambda: np.where(np.logical_and(a > range_low, a < range_high)), number=100000)) # 0.4683885000000001
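The timings above are for the 4x6 example, where per-call overhead is a large part of the cost. A rough sketch for repeating the comparison on a bigger input might look like the following. The array a_big, its shape, the seed and the repeat count are all assumptions made for illustration, not the asker's real data, and no timings are quoted because they depend on the machine.

```python
import timeit
import numpy as np

range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

# Hypothetical larger input: random integers, many rows, same six columns.
rng = np.random.default_rng(0)
a_big = rng.integers(0, 10, size=(20_000, 6))

def looped():
    # Original approach: one np.where call per row.
    return [np.where(np.logical_and(x > range_low, x < range_high)) for x in a_big]

def vectorized():
    # Single call on the whole array, relying on broadcasting.
    return np.where(np.logical_and(a_big > range_low, a_big < range_high))

t_loop = timeit.timeit(looped, number=5)
t_vec = timeit.timeit(vectorized, number=5)
print(f"looped: {t_loop:.4f} s, vectorized: {t_vec:.4f} s")
```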

Upvotes: 0

Kevin

Reputation: 3358

If I understand the question correctly, you want the indices of all rows that fit the criteria? Then I would suggest this:

np.flatnonzero(((a > range_low) & (a < range_high)).all(axis=1))
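As a quick check, here is the expression above broken into steps and run on the example data from the question. The intermediate names in_range, row_ok and matching_rows are only illustrative.

```python
import numpy as np

a = np.array([[1, 3, 4, 3, 2, 4],
              [4, 6, 2, 5, 8, 6],
              [1, 4, 8, 3, 1, 9],
              [9, 4, 2, 5, 6, 5]])
range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

# Element-wise test; the 1-D bounds broadcast across every row of `a`.
in_range = (a > range_low) & (a < range_high)

# A row qualifies only if all of its elements are in range.
row_ok = in_range.all(axis=1)

# Convert the boolean row mask into row indices.
matching_rows = np.flatnonzero(row_ok)
print(matching_rows)  # [0]
```

If the rows themselves are needed rather than their indices, a[row_ok] selects them directly.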

If you have NumPy 1.20.0 or later, you can also use the where keyword of numpy.all.

Upvotes: 1
