How can I use numpy for efficient array manipulation?

Question

Suppose I have an array with the following shape:

array([[137, 138, 139, ..., 126, 126, 125],
       [139, 139, 139, ..., 126, 126, 126],
       [140, 140, 140, ..., 127, 126, 126],
       ...,
       [132, 135, 137, ..., 136, 135, 133],
       [133, 134, 136, ..., 136, 134, 133],
       [133, 134, 134, ..., 136, 134, 133]], dtype=uint8)

My goal is to manipulate the array such that I can transform it to a binary array, where a threshold (e.g. 100) determines whether the entry is 0 or 1.

For now I have just implemented nested for-loops that check the rows and columns respectively.

thres = 100

for k in range(0, len(my_array)):
    for i in range(0, len(my_array[0])):
        if(my_array[k][i] < thres):
            my_array[k][i] = 1
        else:
            my_array[k][i] = 0

I should mention that I am not tied to any performance goals. But just out of curiosity: how can I use numpy efficiently to achieve better performance? Since the provided array can be processed in a reasonable amount of time, the approach above works just fine. I can imagine, though, that its performance will become insufficient as the array size increases.

kabanus · Accepted Answer

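How about

my_array[my_array < thres] = 1   # entries below the threshold become 1
my_array[my_array > 1] = 0       # whatever is still above 1 was >= thres (assumes thres > 1)

or even better, if you don't mind True/False values,

my_array < thres

or

my_array = (my_array < thres).astype('uint8')

if you do.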
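To make the variants above concrete, here is a minimal self-contained sketch. The small sample array is made up for illustration (it is not the data from the question), and the `np.where` line is just an alternative spelling of the same comparison, not something the approach requires.

```python
import numpy as np

# Hypothetical sample data standing in for the array in the question.
my_array = np.array([[137,  99, 100],
                     [  0, 255,  42]], dtype=np.uint8)
thres = 100

# Boolean mask: True where the entry is strictly below the threshold.
mask = my_array < thres

# Turn the mask into 0/1 values with the same dtype as the input.
binary = mask.astype(np.uint8)

# Same result written with np.where (1 where the condition holds, else 0).
binary_where = np.where(my_array < thres, 1, 0).astype(np.uint8)

print(binary)
# [[0 1 0]
#  [1 0 1]]
```

The comparison runs over the whole array at once, so no Python-level loop is involved. Note that an entry equal to the threshold maps to 0, matching the `<` in the original loop.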

If you have to loop over the individual elements, do not use a numpy array. Arrays are (much) worse than Python lists for that. Try it on your example and see how much faster a list is. Use a list if you loop.
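One way to try this out is a rough timing sketch like the one below. The random 200x200 array, the helper name `loop_threshold`, and the repeat count are all assumptions made for illustration; actual timings depend on the machine and the numpy version, so no numbers are claimed here.

```python
import timeit
import numpy as np

# Hypothetical test data: random uint8 values, not the array from the question.
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(200, 200), dtype=np.uint8)
thres = 100

def loop_threshold(rows):
    """Element-wise Python loop using the same comparison as the question,
    but building a new nested list instead of modifying the input in place."""
    return [[1 if value < thres else 0 for value in row] for row in rows]

# The same data as a plain nested Python list.
as_list = arr.tolist()

# Time the element-wise loop over the array, over the list,
# and the vectorized comparison for reference.
t_array = timeit.timeit(lambda: loop_threshold(arr), number=5)
t_list = timeit.timeit(lambda: loop_threshold(as_list), number=5)
t_vec = timeit.timeit(lambda: (arr < thres).astype(np.uint8), number=5)

print(f"loop over numpy array: {t_array:.4f} s")
print(f"loop over Python list: {t_list:.4f} s")
print(f"vectorized comparison: {t_vec:.4f} s")
```

All three variants produce the same 0/1 values, so the only thing being compared is how the elements are traversed.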
