I am playing around with numba to accelerate my code. I notice that the performance varies significantly when using np.inf instead of np.nan inside the function. Below I have attached three sample functions for illustration. function1 is not accelerated by numba. function2 and function3 are both accelerated by numba, but one uses np.nan while the other uses np.inf. On my machine, the average runtimes of the three functions are 0.032284s, 0.041548s and 0.019712s respectively. It appears that using np.nan is much slower than np.inf. Why does the performance vary so significantly? Thanks in advance.
Edit: I am using Python 3.7.11 and Numba 0.55.0rc1.
import numpy as np
import numba as nb

def function1(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan
    for r in range(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0
    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function2(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan
    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0
    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function3(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.inf
    output2[:] = np.inf
    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0
    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.inf)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

array1 = 10 * np.random.random((1000, 1000))
array2 = 10 * np.random.random((1000, 1000))
output1 = function1(array1, array2)
output2 = function2(array1, array2)
output3 = function3(array1, array2)
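For reference, averages like those above can be obtained with a harness along these lines (a sketch: the warm-up call and the repetition count of 10 are assumptions; the warm-up keeps Numba's compilation time out of the measurement):

import timeit

for name, f in [('function1', function1), ('function2', function2), ('function3', function3)]:
    f(array1, array2)  # warm-up so JIT compilation is not timed
    t = timeit.timeit(lambda: f(array1, array2), number=10)
    print(name, t / 10)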
Upvotes: 1
Views: 853
The second one is much slower because the filter based on output1 != np.nan keeps everything: np.nan != np.nan is True, and in fact v != np.nan is True for any value v, so id_keep selects every element and output1[id_keep] is just a full copy of output1. Thus, the resulting arrays to compute are much bigger, causing a slower execution.
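A quick standalone demonstration of this pitfall (a sketch, independent of the code above):

import numpy as np

a = np.array([1.0, np.nan, 3.0])
print(np.nan != np.nan)     # True: NaN compares unequal even to itself
mask = a != np.nan          # every element compares unequal to NaN
print(mask)                 # [ True  True  True]
print(a[np.nonzero(mask)])  # [ 1. nan  3.] -- the NaN is NOT filtered out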
The point is you must never compare a value to np.nan using comparison operators: use np.isnan(value) instead. In your case, you should use np.logical_not(np.isnan(output1)).
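Continuing the small example above, the corrected test behaves as intended:

import numpy as np

a = np.array([1.0, np.nan, 3.0])
id_keep = np.nonzero(np.logical_not(np.isnan(a)))  # reliable NaN detection
print(a[id_keep])  # [1. 3.] -- the NaN entry is now dropped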
Once the code has been corrected, the NaN-based implementation (function2) may still be slightly slower due to the temporary array created by np.logical_not, but I did not see any statistically significant difference on my machine between using NaN and Inf.
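For completeness, here is a sketch of function2 with only the NaN test fixed (signature and structure taken from the question; the unused diff is dropped and np.float64 replaces the dtype string):

import numpy as np
import numba as nb

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function2_fixed(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype=np.float64)
    output2 = np.empty((nr, nc), dtype=np.float64)
    output1[:] = np.nan
    output2[:] = np.nan
    for r in nb.prange(nr):
        id_threshold = np.nonzero((array1[r] - array2[r]) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0
    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(np.logical_not(np.isnan(output1)))  # proper NaN test
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    return np.vstack((output1, output2))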
Upvotes: 4