Reputation: 631
To speed up a loop, I used Numba's vectorize decorator.
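import numba
import pandas as pd

s1 = pd.Series([1, 3, 5, 6, 8, 10, 1, 1, 1, 1, 1, 1])
s2 = pd.Series([4, 5, 6, 8, 10, 1, 7, 1, 6, 5, 4, 3])
ding = pd.DataFrame({'A': s1, 'B': s2})

# The columns are int64 by default, so the signature takes int64 inputs
@numba.vectorize(['float64(int64, int64)'])
def sumd(a, b):
    # Sum of A and B where A == 1, otherwise 0
    if a == 1:
        return a + b
    else:
        return 0

ding['sum'] = sumd(ding.A, ding.B)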
Now I also want to return the product of columns A and B, i.e. my aim is to return two values from a function decorated with vectorize. I am not sure how to set up numba.vectorize for this. Please help. I am also open to any other way of making this more efficient.
One alternative I tried is the following, but it seems more complicated than it should be. I am looking for a simpler way to optimize the function. Thanks in advance.
s1 = pd.Series([1, 3, 5, 6, 8, 10, 1, 1, 1, 1, 1, 1])
s2 = pd.Series([4, 5, 6, 8, 10, 1, 7, 1, 6, 5, 4, 3])
ding = pd.DataFrame({'A': s1, 'B': s2})

@numba.vectorize(['float64(int64, int64)'])
def sumd(a, b):
    # Collect both results in global lists; the return value is a dummy
    if a == 1:
        sumarr.append(a + b)
        prodarr.append(a * b)
        return 1
    else:
        sumarr.append(0)
        prodarr.append(0)
        return 1

sumarr = []
prodarr = []
sumd(ding.A, ding.B)
ding['sum'] = sumarr
ding['prod'] = prodarr
Upvotes: 2
Views: 1072
Reputation: 1
You could try:
1. Add an extra argument that selects between sum and product, and call the function twice, once per output. This also suits the parallel and cuda targets.
@numba.vectorize(['float64(int64, int64, int64)'])
def sumd(a, b, retopt):
    if retopt == 1:
        return a + b
    if retopt == 2:
        return a * b
    return 0  # any other retopt value
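2. Pack the sum and the product into a single return value. For example, if you know that max(abs(s1), abs(s2)) = 37, set kBypass to the next order of magnitude above 37, i.e. 100, so that every sum stays below kBypass (this assumes non-negative values). Inside the function:
    return kBypass * product + sum
then split the result afterwards with something like
    product, sum = divmod(out, kBypass)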
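As a rough illustration of the packing idea, here is a minimal plain-Python sketch without Numba. The names `K_BYPASS`, `pack` and `unpack` are made up for this example, and it assumes non-negative inputs whose sum is always smaller than `K_BYPASS`.

```python
# Hypothetical names; K_BYPASS plays the role of kBypass in the text above.
K_BYPASS = 100  # assumption: strictly larger than any possible a + b


def pack(a, b):
    """Encode (a * b, a + b) in one integer; needs 0 <= a + b < K_BYPASS."""
    return K_BYPASS * (a * b) + (a + b)


def unpack(packed):
    """Recover (product, sum) from a value produced by pack()."""
    return divmod(packed, K_BYPASS)


packed = pack(1, 7)              # 100 * 7 + 8
product, total = unpack(packed)  # product is 7, total is 8
```

If a sum ever reaches `K_BYPASS`, the excess spills into the product part and the decoded pair is wrong, so the bound has to be chosen from the actual data. Because the vectorized function returns float64, the packed value also needs to stay small enough to be represented exactly.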
Upvotes: 0
Reputation: 68682
You can't return multiple values from a vectorize function, and appending to global lists is not going to work. I would just use a standard jit function instead:
import numba as nb
import numpy as np

@nb.jit(nopython=True)
def sumd(a, b):
    # Outputs start at zero; only positions where a == 1 are filled in
    sumx = np.zeros_like(a, dtype=np.float64)
    prodx = np.zeros_like(a, dtype=np.float64)
    for i in range(a.shape[0]):
        if a[i] == 1:
            sumx[i] = a[i] + b[i]
            prodx[i] = a[i] * b[i]
    return sumx, prodx

sumx, prodx = sumd(ding.A.values, ding.B.values)
ding['sum'] = sumx
ding['prod'] = prodx
Note that I'm passing in the values of each column (the underlying NumPy arrays) so that numba can run in nopython mode, since this is always more efficient.
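Since the question is open to other approaches, here is a sketch of the same computation with plain NumPy and no Numba at all. This is a different technique from the jit loop above, not a variant of it; the function name `sum_and_prod` is made up for the example, and nothing here is benchmarked, so whether it is faster for a given data size is an open question.

```python
import numpy as np


def sum_and_prod(a, b):
    """Return (sum, product) as float64 arrays, zero wherever a != 1."""
    a = np.asarray(a)
    b = np.asarray(b)
    mask = a == 1
    # np.where picks the computed value where the mask holds, 0 elsewhere
    sumx = np.where(mask, a + b, 0).astype(np.float64)
    prodx = np.where(mask, a * b, 0).astype(np.float64)
    return sumx, prodx


a = np.array([1, 3, 5, 6, 8, 10, 1, 1, 1, 1, 1, 1])
b = np.array([4, 5, 6, 8, 10, 1, 7, 1, 6, 5, 4, 3])
sumx, prodx = sum_and_prod(a, b)
```

With the DataFrame from the question, the call would be `sum_and_prod(ding.A.values, ding.B.values)`, and the two results can be assigned to new columns as before.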
Upvotes: 4