I have the following code to approximate the second derivative of a function f()
using the central-difference formula f''(x) ≈ (f(x-h) - 2*f(x) + f(x+h)) / h**2:
I would like to compare two approaches, one using loops and one using a matrix-vector product, and hope to show that the NumPy version is faster:
def get_derivative_loop(X):
    DDF = []
    for i in range(1, len(X)-1):
        DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
    return DDF
def get_derivative_matrix(X):
    A = (np.diag(np.ones(m)) +
         np.diag(-2*np.ones(m-1), 1) +
         np.diag(np.ones(m-2), 2))/(h**2)
    return np.dot(A[0:m-2], f(X))
As expected, a lot of time is consumed in constructing the matrix A. Is there a better way to construct a three-diagonal matrix in NumPy?
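For context, here is a minimal harness under which the two versions can be compared; the test function f, the grid X, and the globals m and h are not shown above, so the choices below are illustrative:

```python
import numpy as np

# Assumed setup: f, X, m and h are globals in the original code,
# so these particular values are illustrative only.
f = np.sin
m = 500
X, h = np.linspace(0.0, np.pi, m, retstep=True)

def get_derivative_loop(X):
    DDF = []
    for i in range(1, len(X)-1):
        DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
    return DDF

def get_derivative_matrix(X):
    A = (np.diag(np.ones(m)) +
         np.diag(-2*np.ones(m-1), 1) +
         np.diag(np.ones(m-2), 2))/(h**2)
    return np.dot(A[0:m-2], f(X))

# Both should approximate f'' = -sin at the interior grid points.
print(np.allclose(get_derivative_loop(X), get_derivative_matrix(X)))
```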
Profiling both functions yields:
Total time: 0.003942 s
File: diff.py
Function: get_derivative_matrix at line 17
Line # Hits Time Per Hit % Time Line Contents
==============================================================
17 @profile
18 def get_derivative_matrix(X):
19 1 3584 3584.0 90.9 A = (np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
20 1 358 358.0 9.1 return np.dot(A[0:m-2], f(X))
Total time: 0.004111 s
File: diff.py
Function: get_derivative_loop at line 22
Line # Hits Time Per Hit % Time Line Contents
==============================================================
22 @profile
23 def get_derivative_loop(X):
24 1 1 1.0 0.0 DDF = []
25 499 188 0.4 4.6 for i in range(1, len(X)-1):
26 498 3921 7.9 95.4 DDF.append((f(X[i-1]) - 2*f(X[i]) + f(X[i+1]))/(h**2))
27
28 1 1 1.0 0.0 return DDF
EDIT
Although it is true that the initialisation is done only once, so there is no real need for optimisation, I still found it interesting to come up with a nice and fast way to set up that matrix.
Here are the profiling results using Divakar's method:
Timer unit: 1e-06 s
Total time: 0.006923 s
File: diff.py
Function: get_derivative_matrix_divakar at line 19
Line # Hits Time Per Hit % Time Line Contents
==============================================================
19 @profile
20 def get_derivative_matrix_divakar(X):
21
22 # Setup output array, equivalent to A
23 1 48 48.0 0.7 out = np.zeros((m, 3+m-2))
24
25 # Setup the triplets in each row as [1,-2,1]
26 1 1485 1485.0 21.5 out[:, 0:3] = 1
27 1 22 22.0 0.3 out[:, 1] = -2
28
29 # Slice and perform matrix-multiplication
30 1 5368 5368.0 77.5 return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))
Total time: 0.019717 s
File: diff.py
Function: get_derivative_matrix at line 45
Line # Hits Time Per Hit % Time Line Contents
==============================================================
45 @profile
46 def get_derivative_matrix(X):
47 1 18813 18813.0 95.4 A = (np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
48 1 904 904.0 4.6 return np.dot(A[0:m-2], f(X))
Total time: 0.000108 s
File: diff.py
Function: get_derivative_slice at line 41
Line # Hits Time Per Hit % Time Line Contents
==============================================================
41 @profile
42 def get_derivative_slice(X):
43 1 108 108.0 100.0 return (f(X[0:-2]) - 2*f(X[1:-1]) + f(X[2:]))/(h**2)
The new method is faster. However, I don't understand why 21.5% of the time is spent on the initialisation out[:, 0:3] = 1.
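One plausible explanation (an assumption about memory allocation, not something the profile proves): np.zeros usually gets zero-on-demand pages from the operating system, so the cost of actually touching the memory is deferred until the first write. out[:, 0:3] = 1 writes into every row, faulting in a fresh page per row, so it pays allocation cost that the np.zeros line appeared to skip. A rough way to observe this:

```python
import time
import numpy as np

m = 4000  # large enough that the array spans many OS pages

out = np.zeros((m, 3 + m - 2))   # allocation is cheap: pages not yet touched

t0 = time.perf_counter()
out[:, 0:3] = 1                  # first write: pages are faulted in here
first = time.perf_counter() - t0

t0 = time.perf_counter()
out[:, 0:3] = 1                  # second write: memory is already resident
second = time.perf_counter() - t0

# On many systems `first` is noticeably larger than `second`.
print(first, second)
```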
Upvotes: 2
Views: 99
For m = 9, the three-diagonal matrix without the scaling by h**2 would look like this -
array([[ 1., -2., 1., 0., 0., 0., 0., 0., 0.],
[ 0., 1., -2., 1., 0., 0., 0., 0., 0.],
[ 0., 0., 1., -2., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 1., -2., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 1., -2., 1., 0., 0.],
[ 0., 0., 0., 0., 0., 1., -2., 1., 0.],
[ 0., 0., 0., 0., 0., 0., 1., -2., 1.],
[ 0., 0., 0., 0., 0., 0., 0., 1., -2.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
One can see that, going row-wise, there are exactly 7 (= m-2) zeros separating two triplets of [1,-2,1]. So, as a hack, one can create a regular 2D array with those replicated triplets as the first three columns, something like this -
array([[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., -2., 1., 0., 0., 0., 0., 0., 0., 0.]])
The advantage of creating the matrix this way is easy indexing, which should be pretty efficient. The leftover work to get the desired output is slicing to restrict ourselves to m**2 elements and taking care of the triplet at the end. We end up with something like this to get the three-diagonal matrix -
def three_diag_mat(m, h):
    # Initialize output array
    out = np.zeros((m, 3+m-2))

    # Setup the triplets in each row as [1,-2,1]
    out[:, :3] = 1
    out[:, 1] = -2

    # Reset the ending "1" of the second last row to zero.
    out[m-2, 2] = 0

    # Slice up to m**2 elements in a flattened version.
    # Then scale down the sliced output by h**2 for the final output.
    return (out.ravel()[:m**2].reshape(m, m))/(h**2)
Runtime tests and verification of results
Case #1:
In [8]: m = 100; h = 10
In [9]: %timeit (np.diag(np.ones(m)) +
np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
10000 loops, best of 3: 119 µs per loop
In [10]: %timeit three_diag_mat(m,h)
10000 loops, best of 3: 51.8 µs per loop
In [11]: np.array_equal((np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) +
np.diag(np.ones(m-2), 2))/(h**2),three_diag_mat(m,h))
Out[11]: True
Case #2:
In [12]: m = 1000; h = 10
In [13]: %timeit (np.diag(np.ones(m)) +
np.diag(-2*np.ones(m-1), 1) + np.diag(np.ones(m-2), 2))/(h**2)
100 loops, best of 3: 16.2 ms per loop
In [14]: %timeit three_diag_mat(m,h)
100 loops, best of 3: 5.66 ms per loop
In [15]: np.array_equal((np.diag(np.ones(m)) + np.diag(-2*np.ones(m-1), 1) +
np.diag(np.ones(m-2), 2))/(h**2),three_diag_mat(m,h))
Specific use-case: For the use-case where you are using A[0:m-2], you can avoid a few calculations and have a modified get_derivative_matrix like so:
def get_derivative_matrix(X):
    # Setup output array, equivalent to A
    out = np.zeros((m, 3+m-2))

    # Setup the triplets in each row as [1,-2,1]
    out[:, :3] = 1
    out[:, 1] = -2

    # Slice and perform matrix-multiplication
    return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))
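As a sanity check (with an assumed f, m, and h, since those are globals in the question), the sliced-band version can be compared against applying the stencil directly:

```python
import numpy as np

f = np.sin                      # assumed test function
m = 300
X, h = np.linspace(0.0, 2*np.pi, m, retstep=True)

def get_derivative_matrix(X):
    out = np.zeros((m, 3+m-2))
    out[:, :3] = 1
    out[:, 1] = -2
    return np.dot(out.ravel()[:m*(m-2)].reshape(m-2, -1)/(h**2), f(X))

# Direct second-difference stencil on the vector of function values
fx = f(X)
direct = (fx[:-2] - 2*fx[1:-1] + fx[2:])/h**2

print(np.allclose(get_derivative_matrix(X), direct))
```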
Upvotes: 1
There is no need to construct the matrix. You can use the vector of function values directly; e.g. the following version will be fine:
def get_derivative(x, f, h):
    fx = f(x)
    return (fx[:-2] - 2*fx[1:-1] + fx[2:])/h**2
The matrix method is useful when the derivative is computed repeatedly: you store the matrix once and reuse it every time. It becomes more useful for higher-order accuracy.
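That reuse idea can be sketched like this (the grid and function choices are illustrative):

```python
import numpy as np

m = 200
x, h = np.linspace(0.0, 1.0, m, retstep=True)

# Build the (m-2) x m second-difference matrix once...
A = (np.diag(np.ones(m)) +
     np.diag(-2*np.ones(m-1), 1) +
     np.diag(np.ones(m-2), 2))[:m-2] / h**2

# ...then reuse it for as many function evaluations as needed.
for f in (np.sin, np.cos, np.exp):
    ddf = A.dot(f(x))        # second derivative at the interior points
```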
Upvotes: 0