Reputation: 89
I'm trying to convert my program to Numba, but I'm having trouble nesting one function inside another. My methods are based on NumPy's vectorize, but I can't do the same with Numba. Do you know of any analogous examples I could follow?
This is my program:
import numpy as np
import scipy
import functools
from multiprocessing import Pool
import lib as mlib
from tqdm import tqdm

class vectorize(np.vectorize):
    def __get__(self, obj, objtype):
        return functools.partial(self.__call__, obj)

class stability:
    def __init__(self):
        self.r1 = 20
        self.r2 = 50
        self.rz = 20
        self.zeta = self.r2 / self.r1
        self.beta = self.rz / self.r1
        self.Ms = 0.956e6
        self.delta = 1

    @vectorize
    def f(self, ro, rs):
        # print("delta=", self.delta)
        return rs / ro * np.exp((-1 / self.delta) * (ro - rs))

    @vectorize
    def mz(self, ro, rs):
        return (1 - self.f(ro, rs)**2) / (1 + self.f(ro, rs)**2)

    @vectorize
    def mro(self, ro, rs):
        return (2 * self.f(ro, rs)) / (1 + self.f(ro, rs)**2)

    @vectorize
    def E1m(self, a, b, N, rs, d):
        r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)
        fx = r * ((1 / self.delta + 1 / r)**2 * self.mro(r, rs)**2
                  + (1 / r**2 + 1) * self.mro(r, rs)**2
                  + d * (-(1 / self.delta + 1 / r) * self.mro(r, rs)
                         + 1 / r * self.mro(r, rs) * self.mz(r, rs)))
        area = np.sum(fx) * (b - a) / N
        return area

if __name__ == "__main__":
    rs = np.arange(0, 100, 1)
    model = stability()
    print(model.E1m(0, 20, 300, rs, 2))
Upvotes: 5
Views: 3518
Reputation: 152677
Most of the built-in NumPy functions are already vectorized and don't need the np.vectorize decorator at all. In general the numpy.vectorize decorator will produce very slow results (compared to NumPy)! As the documentation mentions in the Notes section:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
You can improve your code's efficiency immensely by removing the decorator from f, mz, and mro. It will give the same results but run much faster (your code: 10.4 seconds, changed code: 0.014 seconds).
The E1m function could also be improved (in terms of performance) by using broadcasting instead of vectorize.
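For illustration only, here is a minimal sketch of what that could look like (my sketch, not code from the answer; the method bodies simply mirror the question, and only the delta attribute is kept). The arithmetic and np.exp already broadcast over arrays, so the decorator can just be dropped, and E1m can shape r as a column so it broadcasts against the 1-D rs array:

import numpy as np

class stability:
    def __init__(self):
        self.delta = 1   # the only attribute these methods actually use

    def f(self, ro, rs):
        # plain NumPy: works for scalars and arrays alike
        return rs / ro * np.exp((-1 / self.delta) * (ro - rs))

    def mz(self, ro, rs):
        f2 = self.f(ro, rs) ** 2
        return (1 - f2) / (1 + f2)

    def mro(self, ro, rs):
        f_ = self.f(ro, rs)
        return 2 * f_ / (1 + f_ ** 2)

    def E1m(self, a, b, N, rs, d):
        # midpoint grid as a column vector, shape (N, 1)
        r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)[:, None]
        mro_ = self.mro(r, rs)   # shape (N, len(rs)) by broadcasting
        mz_ = self.mz(r, rs)
        fx = r * ((1 / self.delta + 1 / r) ** 2 * mro_ ** 2
                  + (1 / r ** 2 + 1) * mro_ ** 2
                  + d * (-(1 / self.delta + 1 / r) * mro_
                         + 1 / r * mro_ * mz_))
        return np.sum(fx, axis=0) * (b - a) / N   # one value per rs entry

With this version, model.E1m(0, 20, 300, rs, 2) returns the whole result array in one call instead of looping per rs element.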
However, since your question was about how to use numba.vectorize on these functions, I have some bad news: it's not possible to use numba.vectorize on instance methods, because Numba needs type information and that isn't available for custom Python classes.
In general it's also best with Numba to start with pure-loop code over NumPy arrays (no vectorize) and then apply the numba njit decorator (or jit(nopython=True)). That won't work on methods either, but it's much easier to pass in scalar arguments and only iterate over the required arrays.
But if you really want to use the vectorize approach, here's how you would do it for f:
numba.vectorize doesn't work on functions that take self, so you need a static method or a standalone function. Because you don't have access to self you either need to pass in delta or make it global. I decided to make it an argument:

def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))

Then you need to apply the numba.vectorize decorator with an explicit signature (in the form return_type(argument_1_type, argument_2_type, ...)):

@nb.vectorize('f8(i8, f8, f8)')
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))
And that's basically it.
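As a quick sanity check (arbitrary example values, not from the question): the object produced by @nb.vectorize is a real NumPy ufunc, so it broadcasts over arrays out of the box.

ro = np.arange(1, 5)      # integers, matching the i8 slot in the signature
print(f(ro, 2.0, 1.0))    # element-wise evaluation, returns a float64 array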
For mz and mro you can do the same (remember that you also need the delta there):
@nb.vectorize('f8(i8, f8, f8)')
def mz(ro, rs, delta):
    return (1 - f(ro, rs, delta)**2) / (1 + f(ro, rs, delta)**2)

@nb.vectorize('f8(i8, f8, f8)')
def mro(ro, rs, delta):
    return (2 * f(ro, rs, delta)) / (1 + f(ro, rs, delta)**2)
Converting the E1m function seems to be a bit trickier (I haven't attempted it) and I leave it as an exercise for the reader.
In case you're interested in how I would solve it without vectorize:
import numpy as np
import numba as nb

@nb.njit
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))

@nb.njit
def mz(ro, rs, delta):
    f_2 = f(ro, rs, delta) ** 2
    return (1 - f_2) / (1 + f_2)

@nb.njit
def mro(ro, rs, delta):
    f_ = f(ro, rs, delta)
    return (2 * f_) / (1 + f_**2)

@nb.njit(parallel=True)
def E1m(a, b, N, rs, d):
    delta = 1
    r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)
    result = np.empty(rs.size)
    for idx in nb.prange(rs.size):
        rs_item = rs[idx]
        sum_ = 0.
        for r_item in r:
            mro_ = mro(r_item, rs_item, delta)
            sum_ += r_item * ((1 / delta + 1 / r_item)**2 * mro_**2
                              + (1 / r_item**2 + 1) * mro_**2
                              + d * (-(1 / delta + 1 / r_item) * mro_
                                     + 1 / r_item * mro_ * mz(r_item, rs_item, delta)))
        result[idx] = sum_ * (b - a) / N
    return result
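For completeness, a call that mirrors the __main__ block from the question (same values as there, nothing new):

rs = np.arange(0, 100, 1)
print(E1m(0, 20, 300, rs, 2))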
There's probably still a bit that could be optimized by loop-lifting or smarter approaches to the calculations, but on my computer it's already pretty fast: ~100 microseconds compared to the 14 milliseconds from above, so again 100 times faster.
Upvotes: 5
Reputation: 89
MSeifert, thank you! I now have a solution that is 40 times faster:
import numpy as np
import numba

delta = 1  # as in the question's stability.__init__

@numba.vectorize('f8(f8, f8, f8)')
def f(ro, rs, delta):
    return rs / ro * np.exp((-1 / delta) * (ro - rs))

@numba.vectorize('f8(f8, f8, f8)')
def mz(ro, rs, delta):
    return (1 - f(ro, rs, delta)**2) / (1 + f(ro, rs, delta)**2)

@numba.vectorize('f8(f8, f8, f8)')
def mro(ro, rs, delta):
    return (2 * f(ro, rs, delta)) / (1 + f(ro, rs, delta)**2)

@numba.vectorize(nopython=True)
def E1m(a, b, N, rs, d):
    r = np.linspace(a + (b - a) / (2 * N), b - (b - a) / (2 * N), N)
    fx = r * ((1 / delta + 1 / r)**2 * mro(r, rs, delta)**2
              + (1 / r**2 + 1) * mro(r, rs, delta)**2
              + d * (-(1 / delta + 1 / r) * mro(r, rs, delta)
                     + 1 / r * mro(r, rs, delta) * mz(r, rs, delta)))
    area = np.sum(fx) * (b - a) / N
    return area
x = np.arange(0, 100, 1)
%timeit E1m(0, 20, 300, x, 2)
571 µs ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Upvotes: 0