Litwos

Reputation: 1338

Create array by conditional masking

I have several arrays, each with thousands of elements. I need to read all of them and build an output array based on multiple conditions. Following the approach from this question (Iterating over a numpy array), I put together a solution, but it runs very slowly on my large arrays.

Here is the code, run on a test sample with only two arrays (I can have more than two):

import numpy as np
from random import randint
import random


def declare_arrays():
    random.seed(1)
    w, h = 10, 10
    mat1 = np.array([[randint(0, 100) for x in range(w)] for y in range(h)])
    print(mat1, type(mat1))

    random.seed(2)
    mat2 = np.array([[randint(0, 100) for i in range(w)] for j in range(h)])
    print(mat2, type(mat2))

    return mat1, mat2


if __name__ == "__main__":
    arr1, arr2 = declare_arrays()

    arr_out = np.zeros(arr1.shape)

    for (i, j), val1 in np.ndenumerate(arr1):
        val2 = arr2[i, j]

        if (val1 > val2) and (val2 > 5):
            arr_out[i, j] = val2
        else:
            arr_out[i, j] = val1
    print("Arr out: ", arr_out)

This gives the result:

Arr out:
 [[   7.   11.   10.    8.   21.   15.   63.   39.   32.   60.]
 [  27.   48.  100.   26.   12.   20.    3.   49.   50.   77.]
 [  65.   47.    0.   56.   57.   34.   92.   29.   46.   13.]
 [  40.    3.    2.    3.   21.   69.    1.   30.   29.   27.]
 [  22.   41.    3.   17.   28.   65.   46.   63.   70.   29.]
 [  23.   29.   53.   28.   67.   58.   37.    2.   45.   46.]
 [  57.   12.   23.   51.   91.   37.   15.   83.   42.   31.]
 [  62.   35.   54.   64.   65.   24.   38.   36.   59.   44.]
 [  64.   50.   71.    4.   58.   31.   84.   28.   41.   85.]
 [  21.   46.   34.   89.   61.   39.   38.   47.   11.   56.]]

But this was run on a 10x10 array. On a 10000x10000 array, it takes an enormous amount of time. Is there a way to make this faster? Thanks for any help!

Upvotes: 0

Views: 71

Answers (1)

Divakar

Reputation: 221514

Use np.where to replace the element-wise loop with a single vectorized expression -

arr_out = np.where( (arr1 > arr2) & (arr2 > 5), arr2, arr1)
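
As a quick sanity check, here is a minimal sketch that compares the loop from the question with the np.where version on small random inputs. The names combine_loop and combine_vectorized and the threshold parameter are made up for illustration and do not appear in the question. One difference to be aware of: np.zeros creates a float64 array, while np.where on two integer arrays keeps the integer dtype, so the values should agree but the dtype may differ.

```python
import numpy as np


def combine_loop(arr1, arr2, threshold=5):
    """Element-by-element version, mirroring the loop in the question."""
    out = np.zeros(arr1.shape)  # float64 by default
    for (i, j), val1 in np.ndenumerate(arr1):
        val2 = arr2[i, j]
        if (val1 > val2) and (val2 > threshold):
            out[i, j] = val2
        else:
            out[i, j] = val1
    return out


def combine_vectorized(arr1, arr2, threshold=5):
    """Same selection done on whole arrays at once with a boolean mask."""
    mask = (arr1 > arr2) & (arr2 > threshold)  # & is the element-wise "and"
    return np.where(mask, arr2, arr1)


# Hypothetical test data, not the arrays from the question.
rng = np.random.RandomState(0)
a = rng.randint(0, 101, size=(10, 10))
b = rng.randint(0, 101, size=(10, 10))

# Both approaches should pick the same value at every position.
assert np.array_equal(combine_loop(a, b), combine_vectorized(a, b))
```

If a float result is needed, as in the original loop, appending .astype(float) to the np.where call should give it.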

Upvotes: 2
