Reputation: 699
I write a function to test numba.guvectorize. This function takes product of two numpy arrays and compute the sum after first axis, as following:
from numba import guvectorize, float64
import numpy as np
@guvectorize([(float64[:], float64[:], float64)], '(n),(n)->()')
def g(x, y, res):
res = np.sum(x * y)
However, the above guvectorize function returns wrong results as shown below:
>>> a = np.random.randn(3,4)
>>> b = np.random.randn(3,4)
>>> np.sum(a * b, axis=1)
array([-0.83053829, -0.15221319, -2.27825015])
>>> g(a, b)
array([4.67406747e-310, 0.00000000e+000, 1.58101007e-322])
What might be causing this problem?
Upvotes: 1
Views: 2324
Reputation: 3437
Function g()
receives an uninitialized array through the res
parameter. Assigning a new value to it doesn't modify the original array passed to the function.
You need to replace the contents of res
(and declare it as an array):
@guvectorize([(float64[:], float64[:], float64[:])], '(n),(n)->()')
def g(x, y, res):
res[:] = np.sum(x * y)
The function operates on 1D vectors and returns a scalar (thus the signature (n),(n)->()
) and guvectorize does the job of dealing with 2D inputs and returning a 1D output.
>>> a = np.random.randn(3,4)
>>> b = np.random.randn(3,4)
>>> np.sum(a * b, axis=1)
array([-3.1756397 , 5.72632531, 0.45359806])
>>> g(a, b)
array([-3.1756397 , 5.72632531, 0.45359806])
But the original Numpy function np.sum
is already vectorized and compiled, so there is little speed gain in using guvectorize in this specific case.
Upvotes: 3
Reputation: 16747
Your a
and b
arrays are 2-dimensional, while your guvectorized function has signature of accepting 1D arrays and returning 0D scalar. You have to modify it to accept 2D and return 1D.
In one case you do np.sum
with axis = 1
in another case without it, you have to do same thing in both cases.
Also instead of res = ...
use res[...] = ...
. Maybe it is not the problem in case of guvectorize but it can be a general problem in Numpy code because you have to assign values instead of variable reference.
In my case I added cache = True
param to guvectorize decorator, it only speeds up running through caching/re-using Numba compiled code and not re-compiling it on every run. It just speeds up things.
Full modified corrected code see below:
from numba import guvectorize, float64
import numpy as np
@guvectorize([(float64[:, :], float64[:, :], float64[:])], '(n, m),(n, m)->(n)', cache = True)
def g(x, y, res):
res[...] = np.sum(x * y, axis = 1)
# Test
np.random.seed(0)
a = np.random.randn(3, 4)
b = np.random.randn(3, 4)
print(np.sum(a * b, axis = 1))
print(g(a, b))
Output:
[ 2.57335386 3.41749149 -0.42290296]
[ 2.57335386 3.41749149 -0.42290296]
Upvotes: -1