I have the following code to approximate the second derivative of a function f()
using the central-difference formula f''(x) ≈ (f(x-h) - 2*f(x) + f(x+h)) / h**2:
I would like to compare two approaches, one using loops and one using a matrix-vector product, and hope to show that the NumPy version is faster:
def get_derivative_loop(X):
    DDF = []
    for i in range(1, len(X)-1):
        DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
    return DDF
def get_derivative_matrix(X):
    A = (np.diag(np.ones(m)) +
         np.diag(-2*np.ones(m-1), 1) +
         np.diag(np.ones(m-2), 2))/(h**2)
    return np.dot(A[0:m-2], f(X))
As expected, a lot of time is consumed in constructing the matrix A. Is there a better way to construct a three-diagonal matrix in NumPy?
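For context, here is a minimal harness under which the two versions can be compared; the test function f, the grid X, and the globals m and h are not shown above, so the choices below are illustrative:

```python
import numpy as np

# Assumed setup: f, X, m and h are globals in the original code,
# so these particular values are illustrative only.
f = np.sin
m = 500
X, h = np.linspace(0.0, np.pi, m, retstep=True)

def get_derivative_loop(X):
    DDF = []
    for i in range(1, len(X)-1):
        DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
    return DDF

def get_derivative_matrix(X):
    A = (np.diag(np.ones(m)) +
         np.diag(-2*np.ones(m-1), 1) +
         np.diag(np.ones(m-2), 2))/(h**2)
    return np.dot(A[0:m-2], f(X))

# Both should approximate f'' = -sin at the interior grid points.
print(np.allclose(get_derivative_loop(X), get_derivative_matrix(X)))
```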
Profiling both functions yields:
Total time: 0.003942 s
File: diff.py
Function: get_derivative_matrix at line 17
Line # Hits Time Per Hit % Time Line Contents
==============================================================
17 @profile
18 def get_derivative_matrix(X):
19 1 3584 3584.0 90.9 A = (np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
20 1 358 358.0 9.1 return np.dot(A[0:m-2], f(X))
Total time: 0.004111 s
File: diff.py
Function: get_derivative_loop at line 22
Line # Hits Time Per Hit % Time Line Contents
==============================================================
22 @profile
23 def get_derivative_loop(X):
24 1 1 1.0 0.0 DDF = []
25 499 188 0.4 4.6 for i in range(1, len(X)-1):
26 498 3921 7.9 95.4 DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
27
28 1 1 1.0 0.0 return DDF
EDIT
Although it is true that the initialisation is done only once, so there is no real need for optimisation, I still found it interesting to come up with a nice and fast way to set up that matrix.
Here are the profiling results using Divakar's method:
Timer unit: 1e-06 s
Total time: 0.006923 s
File: diff.py
Function: get_derivative_matrix_divakar at line 19
Line # Hits Time Per Hit % Time Line Contents
==============================================================
19 @profile
20 def get_derivative_matrix_divakar(X):
21
22 # Setup output array, equivalent to A
23 1 48 48.0 0.7 out = np.zeros((m, 3+m-2))
24
25 # Setup the triplets in each row as [1,-2,1]
26 1 1485 1485.0 21.5 out[:, 0:3] = 1
27 1 22 22.0 0.3 out[:, 1] = -2
28
29 # Slice and perform matrix-multiplication
30 1 5368 5368.0 77.5 return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))
Total time: 0.019717 s
File: diff.py
Function: get_derivative_matrix at line 45
Line # Hits Time Per Hit % Time Line Contents
==============================================================
45 @profile
46 def get_derivative_matrix(X):
47 1 18813 18813.0 95.4 A = (np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
48 1 904 904.0 4.6 return np.dot(A[0:m-2], f(X))
Total time: 0.000108 s
File: diff.py
Function: get_derivative_slice at line 41
Line # Hits Time Per Hit % Time Line Contents
==============================================================
41 @profile
42 def get_derivative_slice(X):
43 1 108 108.0 100.0 return (f(X[0:-2]) - 2*f(X[1:-1]) + f(X[2:]))/(h**2)
The new method is faster. However, I don't understand why 21.5% of the time is spent on the initialisation out[:, 0:3] = 1.
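One plausible explanation (an assumption about memory allocation, not something the profile proves): np.zeros usually gets zero-on-demand pages from the operating system, so the cost of actually touching the memory is deferred until the first write. out[:, 0:3] = 1 writes into every row, faulting in a fresh page per row, so it pays allocation cost that the np.zeros line appeared to skip. A rough way to observe this:

```python
import time
import numpy as np

m = 4000  # large enough that the array spans many OS pages

out = np.zeros((m, 3 + m - 2))   # allocation is cheap: pages not yet touched

t0 = time.perf_counter()
out[:, 0:3] = 1                  # first write: pages are faulted in here
first = time.perf_counter() - t0

t0 = time.perf_counter()
out[:, 0:3] = 1                  # second write: memory is already resident
second = time.perf_counter() - t0

# On many systems `first` is noticeably larger than `second`.
print(first, second)
```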
Upvotes: 2
Views: 99
For m = 9, the three-diagonal matrix without the scaling by h**2 would look like this -
array([[ 1., -2., 1., 0., 0., 0., 0., 0., 0.],
[ 0., 1., -2., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 1., -2., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 1., -2., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 1., -2., 1., 0., 0.],
[ 0., 0., 0., 0., 0., 1., -2., 1., 0.],
[ 0., 0., 0., 0., 0., 0., 1., -2., 1.],
[ 0., 0., 0., 0., 0., 0., 0., 1., -2.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
One can see that, going row-wise, there are exactly 7 (= m-2) zeros separating two triplets of [1,-2,1]. So, as a hack, one can create a regular 2D array with those replicated triplets as the first three columns, something like this -
array([[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.]])
The advantage of creating the matrix this way is easy indexing, which should be pretty efficient. The leftover work to get the desired output is slicing to restrict ourselves to m**2 elements and taking care of the triplet at the end. We end up with something like this to get the three-diagonal matrix -
def three_diag_mat(m, h):
    # Initialize output array
    out = np.zeros((m, 3+m-2))

    # Setup the triplets in each row as [1,-2,1]
    out[:, :3] = 1
    out[:, 1] = -2

    # Reset the ending "1" of the second last row to zero.
    out[m-2, 2] = 0

    # Slice up to m**2 elements in a flattened version.
    # Then scale down the sliced output by h**2 for the final output.
    return (out.ravel()[:m**2].reshape(m, m))/(h**2)
Runtime tests and verification of results
Case #1:
In [8]: m = 100; h = 10
In [9]: %timeit (np.diag(np.ones(m)) +
np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
10000 loops, best of 3: 119 µs per loop
In [10]: %timeit three_diag_mat(m,h)
10000 loops, best of 3: 51.8 µs per loop
In [11]: np.array_equal((np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) +
np.diag(np.ones(m-2), 2))/(h**2),three_diag_mat(m,h))
Out[11]: True
Case #2:
In [12]: m = 1000; h = 10
In [13]: %timeit (np.diag(np.ones(m)) +
np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
100 loops, best of 3: 16.2 ms per loop
In [14]: %timeit three_diag_mat(m,h)
100 loops, best of 3: 5.66 ms per loop
In [15]: np.array_equal((np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) +
np.diag(np.ones(m-2), 2))/(h**2),three_diag_mat(m,h))
Specific use-case: For the use-case where you are using A[0:m-2], you can avoid a few calculations and have a modified get_derivative_matrix like so:
def get_derivative_matrix(X):
    # Setup output array, equivalent to A
    out = np.zeros((m, 3+m-2))

    # Setup the triplets in each row as [1,-2,1]
    out[:, :3] = 1
    out[:, 1] = -2

    # Slice and perform matrix-multiplication
    return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))
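As a sanity check (with an assumed f, m, and h, since those are globals in the question), the sliced-band version can be compared against applying the stencil directly:

```python
import numpy as np

f = np.sin                      # assumed test function
m = 300
X, h = np.linspace(0.0, 2*np.pi, m, retstep=True)

def get_derivative_matrix(X):
    out = np.zeros((m, 3+m-2))
    out[:, :3] = 1
    out[:, 1] = -2
    return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))

# Direct second-difference stencil on the vector of function values
fx = f(X)
direct = (fx[:-2] - 2*fx[1:-1] + fx[2:])/h**2

print(np.allclose(get_derivative_matrix(X), direct))
```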
Upvotes: 1
There is no need to construct the matrix. You can use the vector of function values directly; e.g. the following version will be fine:
def get_derivative(x, f, h):
    fx = f(x)
    return (fx[:-2] - 2*fx[1:-1] + fx[2:])/h**2
The matrix method is useful when the derivative is computed repeatedly: you store the matrix once and reuse it every time. It becomes more useful for higher-order accuracy.
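That reuse idea can be sketched like this (the grid and function choices are illustrative):

```python
import numpy as np

m = 200
x, h = np.linspace(0.0, 1.0, m, retstep=True)

# Build the (m-2) x m second-difference matrix once...
A = (np.diag(np.ones(m)) +
     np.diag(-2*np.ones(m-1), 1) +
     np.diag(np.ones(m-2), 2))[:m-2] / h**2

# ...then reuse it for as many function evaluations as needed.
for f in (np.sin, np.cos, np.exp):
    ddf = A.dot(f(x))        # second derivative at the interior points
```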
Upvotes: 0