LaioZatt

Reputation: 23

How to iterate faster over a 2-dimensional Python numpy.ndarray

I simply want to make this faster:

heights = []
for x in range(matrix.shape[0]):
    for y in range(matrix.shape[1]):
        if matrix[x][y] == 2 or matrix[x][y] == 3 or matrix[x][y] == 4 or matrix[x][y] == 5 or matrix[x][y] == 6:
            if x not in heights:
                heights.append(x)

It simply iterates over a 2D matrix (usually around 18x18 or 22x22) and records each row index x whose row contains one of those values. But it's kind of slow, so I wonder what the fastest way to do this is.

Thank you very much!

Upvotes: 0

Views: 143

Answers (2)

yatu

Reputation: 88236

For a NumPy-based approach, you can do:

np.flatnonzero(((a>=2) & (a<=6)).any(1))
# array([1, 2, 6], dtype=int64)

Where:

a = np.random.randint(0,30,(7,7))

print(a)

array([[25, 27, 28, 21, 18,  7, 26],
       [ 2, 18, 21, 13, 27, 26,  2],
       [23, 27, 18,  7,  4,  6, 13],
       [25, 20, 19, 15,  8, 22,  0],
       [27, 23, 18, 22, 25, 17, 15],
       [19, 12, 12,  9, 29, 23, 21],
       [16, 27, 22, 23,  8,  3, 11]])
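
To make the one-liner easier to follow, here is a step-by-step sketch on the same sample array. The intermediate names (in_range, row_has_match) are only illustrative and are not part of the original answer:

```python
import numpy as np

# Same sample array as shown above
a = np.array([[25, 27, 28, 21, 18,  7, 26],
              [ 2, 18, 21, 13, 27, 26,  2],
              [23, 27, 18,  7,  4,  6, 13],
              [25, 20, 19, 15,  8, 22,  0],
              [27, 23, 18, 22, 25, 17, 15],
              [19, 12, 12,  9, 29, 23, 21],
              [16, 27, 22, 23,  8,  3, 11]])

# Step 1: boolean mask, True where an element lies in [2, 6]
in_range = (a >= 2) & (a <= 6)

# Step 2: collapse each row to a single bool (True if any element matched)
row_has_match = in_range.any(axis=1)

# Step 3: indices of the rows that matched
heights = np.flatnonzero(row_has_match)
print(heights)  # [1 2 6]
```

Note that the range check (a>=2) & (a<=6) is equivalent to the chain of == comparisons in the question only because the target values 2 to 6 are consecutive integers.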

Timings on a larger array:

a = np.random.randint(0,30, (1000,1000))

%%timeit
heights = []
for x in range(a.shape[0]):
    for y in range(a.shape[1]):
        if a[x][y] == 2 or a[x][y] == 3 or a[x][y] == 4 or a[x][y] == 5 or a[x][y] == 6:
            if x not in heights:
                heights.append(x)
# 3.17 s ± 59.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
yatu = np.flatnonzero(((a>=2) & (a<=6)).any(1))
# 965 µs ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

np.allclose(yatu, heights)
# True

Vectorizing with NumPy yields roughly a 3,300x speedup here (3.17 s vs. 965 µs).

Upvotes: 1

AKX

Reputation: 169032

It looks like you want to find if 2, 3, 4, 5 or 6 appear in the matrix.

You can use np.isin() to create a matrix of true/false values, then use that as an indexer:

>>> arr = np.array([1,2,3,4,4,0]).reshape(2,3)
>>> arr[np.isin(arr, [2,3,4,5,6])]
array([2, 3, 4, 4])

Optionally, turn that into a plain Python set() for faster membership (in) lookups and no duplicates.
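
A minimal sketch of that set conversion, reusing the small arr from above. The names matches and found are made up for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 4, 0]).reshape(2, 3)

# Values from arr that are among the targets (duplicates included)
matches = arr[np.isin(arr, [2, 3, 4, 5, 6])]

# .tolist() converts NumPy scalars to plain Python ints before building the set
found = set(matches.tolist())

print(sorted(found))  # [2, 3, 4]
print(5 in found)     # False
```

For matrices as small as 18x18 this is mostly a convenience; the lookup speed matters more when the set is queried many times.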

To get the positions in the array where those numbers appear, use argwhere:

>>> np.argwhere(np.isin(arr, [2,3,4,5,6]))
array([[0, 1],
       [0, 2],
       [1, 0],
       [1, 1]])
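
If only the row indices are needed, as with the heights list in the question, one possible follow-up is to take the first column of the argwhere result and deduplicate it. This is a sketch under that assumption, and the names positions and rows are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 4, 0]).reshape(2, 3)

# (row, column) pairs of every matching element
positions = np.argwhere(np.isin(arr, [2, 3, 4, 5, 6]))

# Keep the row component only; np.unique sorts and removes duplicates
rows = np.unique(positions[:, 0])

print(rows.tolist())  # [0, 1]
```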

Upvotes: 0
