BaCh

Reputation: 645

NumPy arrays: fast element-wise compare and set?

Is there a function that quickly compares the values in a NumPy array against a fixed value and sets them accordingly?

E.g., assume I have an array with numerical values like this:

0 0 0 3 7 3 0 0 0

I'd like to say: from index position 3 up to (but not including) index position 7, set the value to 5 if it is lower than 5. The result would be this:

0 0 0 5 7 5 5 0 0

The reason I'm asking is that doing this operation "by hand" in a loop seems to be very slow. E.g., the following code takes ~90 s to perform this operation 1 million times, each time on 64 consecutive elements of a 1-million-element array:

import numpy as np
import random

tsize = 1000000
arr = np.zeros(tsize, dtype=np.uint32)

for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    for kpos in range(apos, apos + 64):    # loop to compare and set 64 elements
        if arr[kpos] < num:
            arr[kpos] = num

If there is no such function: are there any obvious NumPy newbie mistakes in the code above that slow it down?

Upvotes: 0

Views: 1419

Answers (2)

akuiper

Reputation: 214957

The for loop can be replaced with a slice and assignment, like so:

arr[apos:apos+64] = np.clip(arr[apos:apos+64], a_min=num, a_max=None)

You can also use np.maximum:

arr[apos:apos+64] = np.maximum(arr[apos:apos+64], num)
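As a side note, both forms above build a temporary array and then copy it back into the slice. A minimal sketch of an in-place variant follows, assuming the window is a basic slice (so it is a view of the array). The helper name raise_floor is hypothetical and only used for illustration:

```python
import numpy as np

def raise_floor(arr, start, stop, floor):
    """Raise every element of arr[start:stop] to at least `floor`, in place."""
    window = arr[start:stop]               # basic slicing returns a view, not a copy
    np.maximum(window, floor, out=window)  # write the result straight into the view
    return arr

a = np.array([0, 0, 0, 3, 7, 3, 0, 0, 0], dtype=np.uint32)
raise_floor(a, 3, 7, 5)
print(a.tolist())  # [0, 0, 0, 5, 7, 5, 5, 0, 0]
```

For 64-element windows the saving is probably small, so it is worth timing both versions before preferring one.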

Timing:

import numpy as np
import random

tsize = 1000
arr = np.zeros(tsize, dtype=np.uint32)

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    for kpos in range(apos, apos + 64):    # loop to compare and set 64 elements
        if arr[kpos] < num:
            arr[kpos] = num
# 10 loops, best of 3: 107 ms per loop

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = np.clip(arr[apos:apos+64], a_min=num, a_max=None)
# 100 loops, best of 3: 4.14 ms per loop

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = np.maximum(arr[apos:apos+64], num)
# 100 loops, best of 3: 4.13 ms per loop

%%timeit
# @Alexander's solution
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = arr[apos:apos+64].clip(min=num)
# 100 loops, best of 3: 3.69 ms per loop
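The timings above still run one Python-level iteration per update. Because taking a maximum gives the same result in any order, the whole batch could in principle be applied with a single unbuffered np.maximum.at call. This is only a sketch under assumptions: the function names are hypothetical, the random inputs are generated up front with NumPy rather than with random.randint, and no speed-up is claimed, since the performance of ufunc.at varies between NumPy versions:

```python
import numpy as np

def apply_floors_loop(arr, starts, nums, width):
    """Reference version: one slice update per (start, num) pair."""
    for start, num in zip(starts, nums):
        window = arr[start:start + width]
        np.maximum(window, num, out=window)
    return arr

def apply_floors_batched(arr, starts, nums, width):
    """Same updates applied through one unbuffered np.maximum.at call."""
    # Row i of the 2-D index grid is starts[i] .. starts[i] + width - 1.
    idx = (starts[:, None] + np.arange(width)).ravel()
    # Repeat each floor value once per element of its window.
    vals = np.repeat(nums, width)
    # Unlike fancy-index assignment, .at handles repeated indices correctly.
    np.maximum.at(arr, idx, vals)
    return arr

rng = np.random.default_rng(0)
size, width, n_updates = 1000, 64, 500
starts = rng.integers(0, size - width + 1, size=n_updates)
nums = rng.integers(1, 123457, size=n_updates).astype(np.uint32)

a = apply_floors_loop(np.zeros(size, dtype=np.uint32), starts, nums, width)
b = apply_floors_batched(np.zeros(size, dtype=np.uint32), starts, nums, width)
print(np.array_equal(a, b))
```

The index array holds n_updates * width entries, so for a million updates it would likely need to be processed in chunks to keep memory use reasonable.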

Upvotes: 2

Alexander

Reputation: 109546

You can use clip on a slice of the array:

import numpy as np

a = np.array([0, 0, 0, 3, 7, 3, 0, 0, 0])
a[3:7] = a[3:7].clip(min=5)  # raise values in positions 3..6 to at least 5
>>> a
array([0, 0, 0, 5, 7, 5, 5, 0, 0])

Upvotes: 2
