hagbard7000

Reputation: 21

Why is the use of numpy.nonzero enormously faster than looping through a numpy array?

Let's create a numpy ndarray of 10 million bools with all values initialized to True

    import numpy as np
    n = 10000000
    sample = np.ones(n, dtype=bool)

Next we'll set a few values to False

    sample[1] = sample[5] = sample[12] = sample[25] = sample[50] = False

The number of True values is now n-5 = 9999995

We can count the number of true values by either looping through the array, or using np.nonzero

The first method takes about 30 seconds on my MacBook as seen from

    !date
    sum = 0
    for i in range(n):
        if sample[i] == True:
            sum = sum + 1
    print(sum)
    !date

Thu Dec 20 01:31:34 EST 2018
9999995
Thu Dec 20 01:32:02 EST 2018

Whereas the second method takes less than a second

    !date
    print(len(np.nonzero(sample)[0]))
    !date

Thu Dec 20 01:33:05 EST 2018
9999995
Thu Dec 20 01:33:05 EST 2018

When the array holds 1 billion bools, np.nonzero again takes less than a second, whereas the loop takes about half an hour.

Why the enormous difference? Is the numpy.nonzero method somehow maintaining some metadata that len is able to access?

Upvotes: 2

Views: 1558

Answers (1)

Amadan

Reputation: 198324

Your first sample has a Python loop, and in each iteration a full boolean object (a numpy.bool_ scalar, a couple of dozen bytes) needs to be constructed from the low-level boolean (1 byte) in the numpy array, and then (unless Python is smarter than I think it is) another boolean object is constructed to hold the result of the useless comparison to True (useless because if x: is always the same as if x == True: given x is a Boolean). On top of that, every iteration also has to update the loop index i and your counter sum, both of which are Python integer objects.
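The per-element boxing described above can be checked directly. The following is a minimal sketch, not part of the original question; the variable names elem and result are only illustrative. It shows that the array stores one byte per element, while each index access and each comparison hands back a separate, larger object (here a NumPy scalar rather than the built-in bool, though the cost argument is the same).

```python
import sys
import numpy as np

# A tiny array is enough to inspect what a single element access returns.
sample = np.ones(10, dtype=bool)
sample[1] = False

elem = sample[1]            # indexing boxes the raw byte into a scalar object
result = (elem == True)     # the comparison produces yet another object

print(sample.itemsize)      # bytes used per element inside the array
print(type(elem))           # type of the boxed element
print(sys.getsizeof(elem))  # size of the boxed object, in bytes
print(type(result))         # type of the comparison result
```

In the question's loop, this boxing and comparison happen once for every one of the 10 million elements.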

Your second sample happens almost entirely in native code, not in Python. The implicit loop is coded in a low-level language, incrementing its counter is a single machine-code instruction, and data access is as direct as could be imagined. You step back into Python only a few times: once to wrap the newly constructed result of np.nonzero in Python objects, and once to construct the Python integer for the result of len.

That is the only difference: Python vs native.
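A rough way to reproduce the comparison without relying on !date is sketched below. It assumes a smaller array (1 million elements instead of 10 million) so the loop finishes quickly, and the helpers count_with_loop and timed are hypothetical names introduced only for this illustration. np.count_nonzero is not mentioned in the question; it is included as an additional native option that counts without building the index array first. Actual timings will vary by machine, so none are quoted here.

```python
import time
import numpy as np

n = 1_000_000  # assumption: smaller than the question's 10 million
sample = np.ones(n, dtype=bool)
sample[[1, 5, 12, 25, 50]] = False  # same five positions as in the question


def count_with_loop(arr):
    """Count True values one element at a time in the Python interpreter."""
    total = 0
    for i in range(len(arr)):
        if arr[i]:  # no '== True' needed for a boolean
            total += 1
    return total


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


loop_count, loop_secs = timed(count_with_loop, sample)
nonzero_count, nonzero_secs = timed(lambda a: len(np.nonzero(a)[0]), sample)
direct_count, direct_secs = timed(np.count_nonzero, sample)

print(f"python loop:      {loop_count} in {loop_secs:.4f} s")
print(f"len(np.nonzero):  {nonzero_count} in {nonzero_secs:.4f} s")
print(f"np.count_nonzero: {direct_count} in {direct_secs:.4f} s")
```

All three approaches should report the same count; only the time spent getting there differs.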

Upvotes: 4
