Reputation: 31
Consider
f = lambda x : 1/x
and I want to get its definite integral between 2 and 7.
The first method uses a linspace and evaluates a Riemann sum over 10^4 terms.
l = list(np.linspace(2,7,10**4))
s = 0
for i in l:
    s += f(i)*(l[1]-l[0])
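For reference, a self-contained version of this sketch (the `numpy` import and the `math.log` comparison are additions for illustration; the exact integral of `1/x` from 2 to 7 is `log(7/2)`):

```python
import math

import numpy as np

f = lambda x: 1 / x

l = list(np.linspace(2, 7, 10**4))
s = 0
for i in l:
    # left Riemann sum with uniform spacing l[1] - l[0]
    s += f(i) * (l[1] - l[0])

exact = math.log(7 / 2)  # ~1.2527629685
print(s, exact)
```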
The second method uses SymPy's `integrate` function and evaluates the result numerically.
x = sp.symbols('x')
t = sp.integrate(f(x),(x,2,7)).evalf()
The output gives us:
Riemann Sum : 1.2529237036377492
--- 13.025045394897461 milliseconds ---
SymPy : 1.25276296849537
--- 71.07734680175781 milliseconds ---
Delta : 0.0128304512843464 %
My question is: why is SymPy around 4 to 5 times slower than a Riemann sum for a delta < 0.1%, and is there any way to improve either of the two methods?
Upvotes: 0
Views: 302
Reputation: 231605
`sympy` is a symbolic/algebraic package, manipulating complex "symbol/expression" objects.
In an `isympy` session:
In [7]: f = lambda x : 1/x
In [8]: integrate(f(x),(x,2,7)).evalf()
Out[8]: 1.25276296849537
In [9]: integrate(f(x),(x,2,7))
Out[9]: -log(2) + log(7)
In [10]: timeit integrate(f(x),(x,2,7))
10.6 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [11]: timeit integrate(f(x),(x,2,7)).evalf()
10.8 ms ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The bulk of the time is spent in the symbolic part, with the final numeric evaluation being relatively fast.
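You can see the split directly by doing the symbolic work once and timing only the final `evalf()` (a sketch; the symbolic result `-log(2) + log(7)` matches `Out[9]` above):

```python
import sympy as sp

x = sp.symbols('x')

# the expensive symbolic step: antiderivative plus limit substitution
expr = sp.integrate(1 / x, (x, 2, 7))   # -> -log(2) + log(7)

# the cheap numeric step: evaluating a tiny fixed expression
val = expr.evalf()
print(expr, val)
```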
Your iterative numeric solution:
In [45]: f = lambda x : 1/x
In [46]: %%timeit
...: s = 0
...: for i in l:
...: s+=f(i)*(l[1]-l[0])
...:
5.91 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
But using `numpy` we can do that a lot faster:
In [47]: (f(np.array(l))*(l[1]-l[0])).sum()
Out[47]: 1.2529237036377558
In [48]: timeit (f(np.array(l))*(l[1]-l[0])).sum()
631 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
and even better if the input is already an array (your linspace without the `list(...)` wrapper):
In [49]: %%timeit larr=np.array(l)
...: (f(larr)*(l[1]-l[0])).sum()
61.2 µs ± 735 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
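One further refinement, not in the answer above but a natural next step on the same array: the trapezoid rule (`np.trapezoid`, called `np.trapz` in older NumPy) removes the endpoint bias of the plain left Riemann sum at essentially the same cost:

```python
import numpy as np

larr = np.linspace(2, 7, 10**4)

# name changed from trapz to trapezoid in NumPy 2.0; support both
trapz = getattr(np, "trapezoid", np.trapz)
approx = trapz(1 / larr, larr)
# much closer to log(7/2) ~ 1.2527629685 than the Riemann sum's 1.2529237
print(approx)
```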
`scipy` has a bunch of integration functions, most of which use compiled libraries like QUADPACK. A basic one is `quad`:
In [50]: from scipy.integrate import quad
In [52]: quad(f,2,7)
Out[52]: (1.2527629684953678, 3.2979213205748694e-12)
In [53]: timeit quad(f,2,7)
7.22 µs ± 57.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
According to the `full_output` display, `quad` only has to call `f()` 21 times, rather than the 10**4 calls your iteration makes.
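That call count can be read back from `quad` itself: passing `full_output=1` returns an extra info dict whose `neval` entry is the number of function evaluations (a sketch of that check):

```python
from scipy.integrate import quad

f = lambda x: 1 / x

# full_output=1 adds an info dict to the (result, abserr) pair
result, abserr, info = quad(f, 2, 7, full_output=1)
print(result, info['neval'])  # 21 evaluations for this smooth integrand
```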
Upvotes: 1