Reputation: 9170

Optimize A*x = B solution for a tridiagonal coefficient matrix

I have a system of equations in the form of A*x = B where [A] is a tridiagonal coefficient matrix. Using the NumPy solver numpy.linalg.solve I can solve the system of equations for x.

See the example below of how I build the tridiagonal [A] matrix and the {B} vector, and solve for x:

# Solve system of equations with a tridiagonal coefficient matrix
# uses numpy.linalg.solve

# use Python 3 print function
from __future__ import print_function
from __future__ import division

# modules
import numpy as np
import time

ti = time.clock()

#---- Build [A] array and {B} column vector

m = 1000   # size of array, make this 8000 to see time benefits

A = np.zeros((m, m))     # pre-allocate [A] array
B = np.zeros((m, 1))     # pre-allocate {B} column vector

A[0, 0] = 1
A[0, 1] = 2
B[0, 0] = 1

for i in range(1, m-1):
    A[i, i-1] = 7   # node-1
    A[i, i] = 8     # node
    A[i, i+1] = 9   # node+1
    B[i, 0] = 2

A[m-1, m-2] = 3
A[m-1, m-1] = 4
B[m-1, 0] = 3

print('A \n', A)
print('B \n', B)

#---- Solve using numpy.linalg.solve

x = np.linalg.solve(A, B)     # solve A*x = B for x

print('x \n', x)

#---- Elapsed time for each approach

print('NUMPY time', time.clock()-ti, 'seconds')

So my question relates to two sections of the above example:

Since I am dealing with a tridiagonal matrix for [A], also called a banded matrix, is there a more efficient way to solve the system of equations instead of using numpy.linalg.solve?
Also, is there a better way to create the tridiagonal matrix instead of using a for-loop?

The above example runs on Linux in about 0.08 seconds according to the time.clock() function.

The numpy.linalg.solve function works fine, but I'm trying to find an approach that takes advantage of the tridiagonal form of [A] in hopes of speeding up the solution even further and then apply that approach to a more complicated example.

Upvotes: 6

Answers (4)

Khalil Al Hooti

Reputation: 4506

This will probably help. The function create_tridiagonal creates a tridiagonal matrix, and the function diagonal_form converts a matrix into the diagonal ordered form required by SciPy's solve_banded function.

import numpy as np    

def lu_decomp3(a):
    r"""
    c,d,e = lu_decomp3(a).
    LU decomposition of tridiagonal matrix a = [c\d\e]. On output
    {c},{d} and {e} are the diagonals of the decomposed matrix a.
    """
    n = np.diagonal(a).size
    assert(np.all(a.shape ==(n,n))) # check if square matrix

    d = np.copy(np.diagonal(a)) # without copy (assignment destination is read-only) error is raised 
    e = np.copy(np.diagonal(a, 1))
    c = np.copy(np.diagonal(a, -1)) 

    for k in range(1,n):
        lam = c[k-1]/d[k-1]
        d[k] = d[k] - lam*e[k-1]
        c[k-1] = lam
    return c,d,e

def lu_solve3(c,d,e,b):
    r"""
    x = lu_solve3(c,d,e,b).
    Solves [c\d\e]{x} = {b}, where {c}, {d} and {e} are the
    vectors returned from lu_decomp3.
    """
    n = len(d)
    y = np.zeros_like(b)

    y[0] = b[0]
    for k in range(1,n): 
        y[k] = b[k] - c[k-1]*y[k-1]

    x = np.zeros_like(b)
    x[n-1] = y[n-1]/d[n-1] # there is no x[n] out of range
    for k in range(n-2,-1,-1):
        x[k] = (y[k] - e[k]*x[k+1])/d[k]
    return x

from scipy.sparse import diags
def create_tridiagonal(size = 4):
    diag = np.random.randn(size)*100
    diag_pos1 = np.random.randn(size-1)*10
    diag_neg1 = np.random.randn(size-1)*10

    # toarray() returns a plain ndarray instead of the legacy np.matrix from todense()
    a = diags([diag_neg1, diag, diag_pos1], offsets=[-1, 0, 1],shape=(size,size)).toarray()
    return a

a = create_tridiagonal(4)
b = np.random.randn(4)*10

print('matrix a is\n = {} \n\n and vector b is \n {}'.format(a, b))

c, d, e = lu_decomp3(a)
x = lu_solve3(c, d, e, b)

print("x from our function is {}".format(x))

print("check is answer correct ({})".format(np.allclose(np.dot(a, x), b)))


## Test Scipy
from scipy.linalg import solve_banded

def diagonal_form(a, upper = 1, lower= 1):
    """
    a is a numpy square matrix
    this function converts a square matrix to diagonal ordered form
    returned matrix in ab shape which can be used directly for scipy.linalg.solve_banded
    """
    n = a.shape[1]
    assert(np.all(a.shape ==(n,n)))

    ab = np.zeros((2*n-1, n))

    for i in range(n):
        ab[i,(n-1)-i:] = np.diagonal(a,(n-1)-i)

    for i in range(n-1): 
        ab[(2*n-2)-i,:i+1] = np.diagonal(a,i-(n-1))


    mid_row_inx = int(ab.shape[0]/2)
    upper_rows = [mid_row_inx - i for i in range(1, upper+1)]
    upper_rows.reverse()
    upper_rows.append(mid_row_inx)
    lower_rows = [mid_row_inx + i for i in range(1, lower+1)]
    keep_rows = upper_rows+lower_rows
    ab = ab[keep_rows,:]


    return ab

ab = diagonal_form(a, upper=1, lower=1) # for tridiagonal matrix upper and lower = 1

x_sp = solve_banded((1,1), ab, b)
print("is our answer the same as scipy answer ({})".format(np.allclose(x, x_sp)))
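When the three diagonals are already available as separate arrays, the dense intermediate matrix may not be needed at all. The following is a minimal sketch of that idea, assuming the (3, n) layout that solve_banded expects for one lower and one upper diagonal. The helper name banded_from_diagonals and the test data are hypothetical and not part of the answer above.

```python
import numpy as np
from scipy.linalg import solve_banded

def banded_from_diagonals(lower, main, upper):
    """Pack three diagonals into the (3, n) layout used by solve_banded.

    Hypothetical helper for illustration.
    lower and upper have length n-1, main has length n.
    """
    n = len(main)
    ab = np.zeros((3, n))
    ab[0, 1:] = upper    # superdiagonal, first slot is unused padding
    ab[1, :] = main      # main diagonal
    ab[2, :-1] = lower   # subdiagonal, last slot is unused padding
    return ab

# Illustrative data: a diagonally dominant tridiagonal system
rng = np.random.default_rng(0)
n = 6
main = 100 + 10 * rng.standard_normal(n)
upper = 10 * rng.standard_normal(n - 1)
lower = 10 * rng.standard_normal(n - 1)
rhs = 10 * rng.standard_normal(n)

ab_direct = banded_from_diagonals(lower, main, upper)
x_banded = solve_banded((1, 1), ab_direct, rhs)

# Dense matrix built only to check the result
dense = np.diag(main) + np.diag(upper, 1) + np.diag(lower, -1)
print(np.allclose(dense @ x_banded, rhs))
```

This keeps memory use at three rows of length n, whereas diagonal_form first allocates a (2n-1, n) array.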

Upvotes: 0

Chris Hagmann

Reputation: 1096

You could use scipy.linalg.solveh_banded.

EDIT: You CANNOT use the above, as your matrix is not symmetric; I thought it was. However, as was mentioned in the comments above, the Thomas algorithm is great for this:

a =       [7] * ( m - 2 ) + [3]
b = [1] + [8] * ( m - 2 ) + [4]
c = [2] + [9] * ( m - 2 )
d = [1] + [2] * ( m - 2 ) + [3]

# This is taken directly from the Wikipedia page also cited above
# this overwrites b and d
def TDMASolve(a, b, c, d):
    n = len(d) # n is the number of rows; a and c have length n-1
    # forward elimination
    for i in range(n-1):
        d[i+1] -= 1. * d[i] * a[i] / b[i]
        b[i+1] -= 1. * c[i] * a[i] / b[i]
    # back substitution
    for i in reversed(range(n-1)):
        d[i] -= d[i+1] * c[i] / b[i+1]
    return [d[i] / b[i] for i in range(n)]

This code is not optimized, nor does it use NumPy, but if I (or any of the other fine folks here) have time, I will edit it so that it does those things. It currently times at ~10 ms for m=10000.
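As a rough check of the approach, here is a sketch of the same elimination that works on copies instead of overwriting its inputs, compared against numpy.linalg.solve. The function name thomas_solve and the variable names are hypothetical, and m is kept small so the comparison is easy to inspect. Like the version above, it does no pivoting, so it assumes no zero pivot appears.

```python
import numpy as np

def thomas_solve(sub, diag, sup, rhs):
    """Thomas algorithm on copies of the inputs (hypothetical name).

    sub and sup have length n-1; diag and rhs have length n.
    No pivoting is done.
    """
    diag = np.array(diag, dtype=float)   # copies, so the caller's data is untouched
    rhs = np.array(rhs, dtype=float)
    n = len(rhs)
    # Forward elimination: remove the subdiagonal
    for i in range(n - 1):
        w = sub[i] / diag[i]
        diag[i + 1] -= w * sup[i]
        rhs[i + 1] -= w * rhs[i]
    # Back substitution
    x = np.empty(n)
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - sup[i] * x[i + 1]) / diag[i]
    return x

m = 10  # small size, chosen only for this check
sub = [7] * (m - 2) + [3]
diag = [1] + [8] * (m - 2) + [4]
sup = [2] + [9] * (m - 2)
rhs = [1] + [2] * (m - 2) + [3]

x_thomas = thomas_solve(sub, diag, sup, rhs)

# Dense reference built from the same diagonals
A = np.diag(diag).astype(float) + np.diag(sup, 1) + np.diag(sub, -1)
x_dense = np.linalg.solve(A, rhs)
print(np.allclose(x_thomas, x_dense))
```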

Upvotes: 1

eickenberg

Reputation: 14377

There is a scipy.sparse matrix type called scipy.sparse.dia_matrix which captures the structure of your matrix well (it will store 3 arrays, in "positions" 0 (diagonal), 1 (above) and -1 (below)). Using this type of matrix you can try scipy.sparse.linalg.lsqr for solving. If your problem has an exact solution, it will be found; otherwise it will find the solution in the least-squares sense.

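import scipy.sparse.linalg   # makes sparse.linalg available
from scipy import sparse
A_sparse = sparse.dia_matrix(A)
ret_values = sparse.linalg.lsqr(A_sparse, B.ravel())   # B is the right-hand side from the question
x = ret_values[0]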

However, this may not be completely optimal in terms of exploiting the tridiagonal structure; there may be a theoretical way of making this faster. What this conversion does do for you is cut the matrix multiplication cost down to the essentials: only the 3 bands are used. This, in combination with the iterative solver lsqr, should already yield a speedup.

Note: I am not proposing scipy.sparse.linalg.spsolve, because it converts your matrix to csr format. However, replacing lsqr with spsolve is worth a try, especially because spsolve can bind UMFPACK, see relevant doc on spsolve. Also, it may be of interest to take a look at this stackoverflow question and answer relating to UMFPACK
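For anyone who wants to try the spsolve variant, here is a minimal sketch. It assumes the matrix is built directly from its diagonals with scipy.sparse.diags in CSC format, so spsolve has no format conversion to do. The variable names are hypothetical and m is kept small so the result is easy to check.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

m = 10  # small size, chosen only for this check

# Diagonals of the matrix from the question
lower = np.full(m - 1, 7.0)
lower[-1] = 3.0
main = np.full(m, 8.0)
main[0], main[-1] = 1.0, 4.0
upper = np.full(m - 1, 9.0)
upper[0] = 2.0

rhs = np.full(m, 2.0)
rhs[0], rhs[-1] = 1.0, 3.0

# Build the sparse matrix straight from the diagonals, already in CSC format
A_csc = sparse.diags([lower, main, upper], offsets=[-1, 0, 1], format="csc")
x_sp = spsolve(A_csc, rhs)

# Residual check: A @ x should reproduce the right-hand side
print(np.allclose(A_csc @ x_sp, rhs))
```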

Upvotes: 1

Craig J Copi

Reputation: 1764

There are two immediate performance improvements: (1) do not use a loop, and (2) use scipy.linalg.solve_banded().

I would write the code something more like this:

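import numpy as np
import scipy.linalg as la

m = 1000   # size of the system, as in the question

# Create arrays and set values
# banded storage: row 0 = upper diagonal, row 1 = main diagonal, row 2 = lower diagonal
ab = np.zeros((3,m))
b = 2*np.ones(m)
ab[0] = 9
ab[1] = 8
ab[2] = 7

# Fix end points
ab[0,1] = 2
ab[1,0] = 1
ab[1,-1] = 4
ab[2,-2] = 3
b[0] = 1
b[-1] = 3

x = la.solve_banded((1,1), ab, b)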

There may be more elegant ways to construct the matrix, but this works.

Using %timeit in IPython the original code took 112 ms for m=1000. This code takes 2.94 ms for m=10,000, an order of magnitude larger problem yet still almost two orders of magnitude faster! I did not have the patience to wait on the original code for m=10,000. Most of the time in the original may be spent constructing the array, but I did not test this. Regardless, for large arrays it is much more efficient to store only the non-zero values of the matrix.
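To confirm that the end-point indices in the banded array line up with the matrix from the question, the bands can be expanded back into a dense matrix and compared. This is only a sketch for checking. The name A_from_bands is hypothetical, and m is kept small so the dense comparison is cheap.

```python
import numpy as np
import scipy.linalg as la

m = 10  # small size, chosen only for this check

# Dense [A] and {B} built with the loop from the question
A = np.zeros((m, m))
B = np.full(m, 2.0)
A[0, 0], A[0, 1], B[0] = 1, 2, 1
for i in range(1, m - 1):
    A[i, i - 1], A[i, i], A[i, i + 1] = 7, 8, 9
A[m - 1, m - 2], A[m - 1, m - 1], B[m - 1] = 3, 4, 3

# Banded storage as in the answer above
ab = np.zeros((3, m))
ab[0], ab[1], ab[2] = 9, 8, 7
ab[0, 1], ab[1, 0], ab[1, -1], ab[2, -2] = 2, 1, 4, 3

# Rebuild a dense matrix from the bands; ab[0, 0] and ab[2, -1] are unused padding
A_from_bands = (np.diag(ab[0, 1:], k=1)
                + np.diag(ab[1])
                + np.diag(ab[2, :-1], k=-1))
print(np.array_equal(A, A_from_bands))

# Both solvers should agree on this small system
x_banded = la.solve_banded((1, 1), ab, B)
x_dense = np.linalg.solve(A, B)
print(np.allclose(x_banded, x_dense))
```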

Upvotes: 4
