Giacomo Petrillo

Reputation: 367

More efficient way to invert a matrix knowing it is symmetric and positive semi-definite

I'm inverting covariance matrices with numpy in python. Covariance matrices are symmetric and positive semi-definite.

I wondered if there exists an algorithm optimised for symmetric positive semi-definite matrices, faster than numpy.linalg.inv() (and of course if an implementation of it is readily accessible from Python!). I did not manage to find anything in numpy.linalg or by searching the web.

EDIT:

As observed by @yixuan, positive semi-definite matrices are not in general invertible. I checked that in my case I actually get positive definite matrices, so I accepted an answer that relies on positive definiteness. Anyway, among the low-level LAPACK routines I found the DSY* family, which is optimised for merely symmetric/Hermitian matrices, although these seem to be missing from scipy (maybe it is just a matter of installed versions).
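
For instance, one quick way to check positive definiteness is to attempt a Cholesky factorization, which fails if the matrix is only semi-definite (a small helper sketch; the name is arbitrary):

import numpy as np

def is_positive_definite(m):
    # np.linalg.cholesky raises LinAlgError unless m is
    # (numerically) symmetric positive definite.
    try:
        np.linalg.cholesky(m)
        return True
    except np.linalg.LinAlgError:
        return False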

Upvotes: 7

Views: 11720

Answers (4)

Kerrick Staley

Reputation: 1943

I tried out @percusse's answer, but when I timed its execution, I found that it was about 33% slower than np.linalg.inv (using a sample of 100K random positive definite 4x4 np.float64 matrices). Here is my implementation:

import numpy as np
from scipy.linalg import lapack

def upper_triangular_to_symmetric(ut):
    # Fill the lower triangle in place by adding the transpose of the
    # strict upper triangle (the lower triangle is zero after dpotrf).
    ut += np.triu(ut, k=1).T

def fast_positive_definite_inverse(m):
    # dpotrf computes the Cholesky factor of m;
    # dpotri computes the inverse from that factor.
    cholesky, info = lapack.dpotrf(m)
    if info != 0:
        raise ValueError('dpotrf failed on input {}'.format(m))
    inv, info = lapack.dpotri(cholesky)
    if info != 0:
        raise ValueError('dpotri failed on input {}'.format(cholesky))
    upper_triangular_to_symmetric(inv)
    return inv

I tried profiling it, and to my surprise, it spends about 82% of its time calling upper_triangular_to_symmetric (which is not the "hard" part)! I think this happens because it does floating-point addition to combine the matrices, instead of a simple copy.
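
If you want to reproduce the profiling, something along these lines works (a sketch; the way I generate the sample matrices here is just illustrative):

import cProfile
import numpy as np

# A sample of random positive definite 4x4 matrices:
# a @ a.T + eps * I is symmetric positive definite for random a.
matrices = [a @ a.T + 1e-3 * np.eye(4)
            for a in np.random.rand(100000, 4, 4)]

# Uses fast_positive_definite_inverse as defined above.
cProfile.run(
    'for m in matrices: fast_positive_definite_inverse(m)',
    sort='cumulative')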

I tried an upper_triangular_to_symmetric implementation that is about 87% faster (see this question and answer):

import numpy as np
from scipy.linalg import lapack

inds_cache = {}

def upper_triangular_to_symmetric(ut):
    # Copy the strict upper triangle into the lower triangle using
    # cached boolean indices, avoiding floating-point addition.
    n = ut.shape[0]
    try:
        inds = inds_cache[n]
    except KeyError:
        inds = np.tri(n, k=-1, dtype=bool)
        inds_cache[n] = inds
    ut[inds] = ut.T[inds]


def fast_positive_definite_inverse(m):
    # Cholesky-factor m, then invert from the factor.
    cholesky, info = lapack.dpotrf(m)
    if info != 0:
        raise ValueError('dpotrf failed on input {}'.format(m))
    inv, info = lapack.dpotri(cholesky)
    if info != 0:
        raise ValueError('dpotri failed on input {}'.format(cholesky))
    upper_triangular_to_symmetric(inv)
    return inv

This version is about 68% faster than np.linalg.inv and only spends about 42% of its time calling upper_triangular_to_symmetric.
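
A rough way to reproduce the timing comparison (a sketch; absolute numbers will depend on your BLAS/LAPACK build):

import timeit
import numpy as np

# One random positive definite 4x4 test matrix.
a = np.random.rand(4, 4)
m = a @ a.T + 1e-3 * np.eye(4)

# Uses fast_positive_definite_inverse as defined above.
print(timeit.timeit(lambda: np.linalg.inv(m), number=100000))
print(timeit.timeit(lambda: fast_positive_definite_inverse(m),
                    number=100000))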

Upvotes: 8

percusse

Reputation: 3106

The API doesn't exist yet, but you can use the low-level LAPACK ?POTRI routine family for it.

The docstring of sp.linalg.lapack.dpotri is as follows:

Docstring:     
inv_a,info = dpotri(c,[lower,overwrite_c])

Wrapper for ``dpotri``.

Parameters
----------
c : input rank-2 array('d') with bounds (n,n)

Other Parameters
----------------
overwrite_c : input int, optional
    Default: 0
lower : input int, optional
    Default: 0

Returns
-------
inv_a : rank-2 array('d') with bounds (n,n) and c storage
info : int
Call signature: sp.linalg.lapack.dpotri(*args, **kwargs)

The most important output is info. If it is zero, the routine finished successfully, regardless of positive definiteness. Because this is a low-level call, no other checks are performed.

>>> M = np.random.rand(10,10)
>>> M = M + M.T
>>> # Make it pos def
>>> M += (1.5*np.abs(np.min(np.linalg.eigvals(M))) + 1) * np.eye(10)
>>> zz, _ = sp.linalg.lapack.dpotrf(M, False, False)
>>> inv_M, info = sp.linalg.lapack.dpotri(zz)
>>> # LAPACK only returns the upper or lower triangular part
>>> inv_M = np.triu(inv_M) + np.triu(inv_M, k=1).T

Also, if you compare the speed:

>>> %timeit sp.linalg.lapack.dpotrf(M)
The slowest run took 17.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.15 µs per loop

>>> %timeit sp.linalg.lapack.dpotri(M)
The slowest run took 24.09 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.08 µs per loop

>>> III = np.eye(10)

>>> %timeit sp.linalg.solve(M, III, sym_pos=1, overwrite_b=1)
10000 loops, best of 3: 40.6 µs per loop

So you get quite a non-negligible speed benefit. If you are working with complex numbers, you have to use zpotri instead.

The question is whether you need the inverse at all. You probably don't: if what you actually need is B⁻¹ * A, then solve(B, A) is better for that.
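
For example (a small illustration; B and A here are just placeholder names, and on older SciPy versions you would pass sym_pos=True instead of assume_a='pos'):

import numpy as np
from scipy import linalg

B = np.random.rand(10, 10)
B = B @ B.T + np.eye(10)  # make B positive definite
A = np.random.rand(10, 3)

# Solving the system directly...
x1 = linalg.solve(B, A, assume_a='pos')

# ...is preferable to forming the inverse and multiplying.
x2 = linalg.inv(B) @ A

assert np.allclose(x1, x2)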

Upvotes: 5

yixuan

Reputation: 834

First of all, you need to make sure that the covariance matrix is positive definite (p.d.) rather than merely semi-definite, since otherwise the matrix is not invertible.

Without the p.d. assumption, matrix inversion is usually done via the LU decomposition, while for p.d. matrices the Cholesky decomposition can be used, which generally reduces the computation cost.

In SciPy, the linalg.solve() function has a parameter sym_pos that tells it to assume the matrix is p.d. Below is a quick example:

import numpy as np
from scipy import linalg
import time

def pd_inv(a):
    # Invert a positive definite matrix by solving a X = I with a
    # solver that assumes symmetric positive definiteness
    # (sym_pos=True; assume_a='pos' on newer SciPy versions).
    n = a.shape[0]
    I = np.identity(n)
    return linalg.solve(a, I, sym_pos=True, overwrite_b=True)

n = 50
A = np.random.rand(n, n)
B = A.dot(A.T)

start = time.perf_counter()
for i in range(10000):
    res = linalg.inv(B)

end = time.perf_counter()
print(end - start)
## 1.324752

start = time.perf_counter()
for i in range(10000):
    res = pd_inv(B)

end = time.perf_counter()
print(end - start)
## 1.109778

On my machine, pd_inv() has some advantage for small matrices (~100x100). For large matrices there is hardly any difference, and sometimes pd_inv() is even slower.
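
Also, if you need to solve against many right-hand sides with the same matrix, it may pay to factor once and reuse the factorization (a sketch using scipy.linalg.cho_factor/cho_solve; I have not benchmarked this here):

import numpy as np
from scipy import linalg

n = 50
A = np.random.rand(n, n)
B = A.dot(A.T) + np.identity(n)  # positive definite

# Factor once...
c_and_lower = linalg.cho_factor(B)

# ...then reuse the factorization for each right-hand side.
for _ in range(10):
    b = np.random.rand(n)
    x = linalg.cho_solve(c_and_lower, b)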

Upvotes: 6

mmdanziger

Reputation: 4658

To my knowledge there is no standard matrix inverse function specialized for symmetric matrices. In general you need more structure, such as sparseness, to get good speed-ups from your solvers. However, if you look at scipy.linalg you'll see there are some eigenvalue routines that are optimized for Hermitian (symmetric) matrices.

For example, when I generate a random 200x200 dense matrix and solve for its eigenvalues, I get:

import numpy as np
from scipy.linalg import inv, pinvh, eig, eigh

B = np.random.rand(200, 200)
B = B + B.T  # make it symmetric

%timeit inv(B)
1000 loops, best of 3: 915 µs per loop

%timeit pinvh(B)
100 loops, best of 3: 6.93 ms per loop

So no advantage on the inverse but:

%timeit eig(B)
10 loops, best of 3: 39.1 ms per loop

%timeit eigh(B)
100 loops, best of 3: 4.9 ms per loop

a cool 8x speedup on eigenvalues.

If your matrix is sparse, you should check out scipy.sparse.linalg, which has a handful of iterative solvers. Some of them (like cg, which assumes a Hermitian positive definite matrix) exploit this structure and so may be faster. However, this route only makes sense if your matrix is actually sparse: it solves for one particular right-hand side b rather than producing the inverse, and it may not actually be faster, depending on the matrix structure. You'll really have to benchmark it to find out.
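
For instance, a minimal cg example on a sparse symmetric positive definite matrix (the tridiagonal matrix here is just a placeholder for whatever structure your problem has):

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import cg

n = 1000
# Tridiagonal and diagonally dominant, hence symmetric positive definite.
A = sparse.diags([-1.0, 2.5, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
b = np.random.rand(n)

# cg solves A x = b for this particular b; it never forms the inverse.
x, info = cg(A, b)
assert info == 0  # info == 0 means it converged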

I asked a similar question about C++ solvers and ultimately found that it's really application-dependent and you have to pick the best solver for your problem.

Upvotes: 5
