Reputation: 21
Let's create a numpy ndarray of 10 million bools with all values initialized to True
import numpy as np
n = 10000000
sample = np.ones(n, dtype=bool)
Next we'll set a few values to False
sample[1] = sample[5] = sample[12] = sample[25] = sample[50] = False
The number of True values is now n-5 = 9999995
We can count the number of true values by either looping through the array, or using np.nonzero
The first method takes about 30 seconds on my MacBook as seen from
!date
sum=0
for i in range(n):
    if sample[i] == True:
        sum=sum+1
print(sum)
!date
Thu Dec 20 01:31:34 EST 2018
9999995
Thu Dec 20 01:32:02 EST 2018
Whereas the second method takes less than a second
!date
print(len(np.nonzero(sample)[0]))
!date
Thu Dec 20 01:33:05 EST 2018
9999995
Thu Dec 20 01:33:05 EST 2018
When the array is 1 billion bools, the second method again takes less than a second, whereas the loop takes about half an hour.
Why the enormous difference? Is the numpy.nonzero method somehow maintaining some metadata that len is able to access?
Upvotes: 2
Views: 1558
Reputation: 198324
Your first sample has a Python loop, and in each iteration a Python-level boolean object (about 28 bytes) needs to be constructed from the low-level boolean (1 byte) in the numpy array. Then (unless Python is smarter than I think it is) another Boolean is constructed to hold the result of comparing that Boolean to True. The comparison is useless, because if x: is always the same as if x == True: given that x is a Boolean. There is also work going on to maintain your loop counter, i, on every iteration.
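To make the boxing step concrete, here is a minimal sketch (my own illustration, not taken from the question) that pulls one element out of a small array and compares the storage inside the array with the size of the object you get back. The exact object size depends on the platform and the NumPy version, so treat the printed number as indicative only. The last lines check that the explicit comparison to True gives the same truth value as the element itself.

```python
import sys
import numpy as np

# A small array is enough to show the effect; the question used 10 million.
sample = np.ones(10, dtype=bool)
sample[1] = False

x = sample[0]                  # indexing boxes the raw byte into an object
print(type(x))                 # a NumPy boolean scalar, not a raw byte
print(sample.itemsize)         # bytes used per element inside the array
print(sys.getsizeof(x))        # bytes used by the boxed object (varies)

# For every element, `v == True` has the same truth value as `v` itself,
# so the comparison in the loop only creates one more temporary object.
same = all(bool(v) == bool(v == True) for v in sample)
print(same)
```

So the loop body can be written as if sample[i]: without changing the result, although the per-element boxing still remains.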
Your second sample happens almost entirely in native code, not in Python. The implicit loop is coded in a low-level language, incrementing its counter is a single machine-code instruction, and data access is as direct as could be imagined. You drop back into Python only a couple of times: once to wrap the newly constructed array in a Python object at the end of np.nonzero, and once to construct the Python integer for the result of len.
That is the only difference: Python vs native.
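If you want to measure this yourself with something finer than !date, the following is a rough sketch using time.perf_counter. I have assumed a smaller n (1 million) so the Python loop finishes quickly, and I have added np.count_nonzero as a third variant; it was not in the question, but it counts directly instead of first building an array of indices. The timings you see will depend on your machine.

```python
import time
import numpy as np

n = 1_000_000  # assumption: smaller than the question's 10 million
sample = np.ones(n, dtype=bool)
sample[[1, 5, 12, 25, 50]] = False  # same five positions as the question

# Variant 1: explicit Python loop, one boxed object per element.
start = time.perf_counter()
loop_count = 0
for i in range(n):
    if sample[i]:
        loop_count += 1
loop_seconds = time.perf_counter() - start

# Variant 2: the question's approach, building an index array natively.
start = time.perf_counter()
nonzero_count = len(np.nonzero(sample)[0])
nonzero_seconds = time.perf_counter() - start

# Variant 3: count natively without materialising the indices.
start = time.perf_counter()
direct_count = int(np.count_nonzero(sample))
direct_seconds = time.perf_counter() - start

print(loop_count, nonzero_count, direct_count)
print(f"loop: {loop_seconds:.4f}s")
print(f"np.nonzero + len: {nonzero_seconds:.4f}s")
print(f"np.count_nonzero: {direct_seconds:.4f}s")
```

All three variants should report the same count, n - 5; only the time spent getting there differs.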
Upvotes: 4