Reputation: 158
I'm currently working on kernel methods, and at some point I needed to make a non positive semi-definite matrix (i.e. similarity matrix) into one PSD matrix. I tried this approach:
def makePSD(mat):
#make symmetric
k = (mat+mat.T)/2
#make PSD
min_eig = np.min(np.real(linalg.eigvals(mat)))
e = np.max([0, -min_eig + 1e-4])
mat = k + e*np.eye(mat.shape[0]);
return mat
but it fails if I test the resulting matrix with the following function:
def isPSD(A, tol=1e-8):
E,V = linalg.eigh(A)
return np.all(E >= -tol)
I also tried the approach suggested in other related question (How can I calculate the nearest positive semi-definite matrix?), but the resulting matrix also failed to pass the isPSD test.
Do you have any suggestions on how to correctly make such transformation correctly?
Upvotes: 10
Views: 21298
Reputation: 111
Just in case someone else faces the same problem. The two methods explained above by @tjiagoM and @Ahmed Fasih are equivalent and should both give the same psd matrix that minimizes the Frobenius norm. In equations (2.1) and (2.2) of the paper
'''
[2] N.J. Higham, "Computing a nearest symmetric positive semidefinite
matrix" (1988): https://doi.org/10.1016/0024-3795(88)90223-6
'''
linked in the answer of Ahmed, the calculation method used by tjiagoM is described and actually referred to as the preferred way of computing the closest psd matrix.
Upvotes: 1
Reputation: 6927
First thing I’d say is don’t use eigh
for testing positive-definiteness, since eigh
assumes the input is Hermitian. That’s probably why you think the answer you reference isn’t working.
I didn’t like that answer because it had an iteration (and, I couldn’t understand its example), nor the other answer there it doesn’t promise to give you the best positive-definite matrix, i.e., the one closest to the input in terms of the Frobenius norm (squared-sum of elements). (I have absolutely no idea what your code in your question is supposed to do.)
I do like this Matlab implementation of Higham’s 1988 paper: https://www.mathworks.com/matlabcentral/fileexchange/42885-nearestspd so I ported it to Python (edit updated for Python 3):
from numpy import linalg as la
import numpy as np
def nearestPD(A):
"""Find the nearest positive-definite matrix to input
A Python/Numpy port of John D'Errico's `nearestSPD` MATLAB code [1], which
credits [2].
[1] https://www.mathworks.com/matlabcentral/fileexchange/42885-nearestspd
[2] N.J. Higham, "Computing a nearest symmetric positive semidefinite
matrix" (1988): https://doi.org/10.1016/0024-3795(88)90223-6
"""
B = (A + A.T) / 2
_, s, V = la.svd(B)
H = np.dot(V.T, np.dot(np.diag(s), V))
A2 = (B + H) / 2
A3 = (A2 + A2.T) / 2
if isPD(A3):
return A3
spacing = np.spacing(la.norm(A))
# The above is different from [1]. It appears that MATLAB's `chol` Cholesky
# decomposition will accept matrixes with exactly 0-eigenvalue, whereas
# Numpy's will not. So where [1] uses `eps(mineig)` (where `eps` is Matlab
# for `np.spacing`), we use the above definition. CAVEAT: our `spacing`
# will be much larger than [1]'s `eps(mineig)`, since `mineig` is usually on
# the order of 1e-16, and `eps(1e-16)` is on the order of 1e-34, whereas
# `spacing` will, for Gaussian random matrixes of small dimension, be on
# othe order of 1e-16. In practice, both ways converge, as the unit test
# below suggests.
I = np.eye(A.shape[0])
k = 1
while not isPD(A3):
mineig = np.min(np.real(la.eigvals(A3)))
A3 += I * (-mineig * k**2 + spacing)
k += 1
return A3
def isPD(B):
"""Returns true when input is positive-definite, via Cholesky"""
try:
_ = la.cholesky(B)
return True
except la.LinAlgError:
return False
if __name__ == '__main__':
import numpy as np
for i in range(10):
for j in range(2, 100):
A = np.random.randn(j, j)
B = nearestPD(A)
assert (isPD(B))
print('unit test passed!')
In addition to just finding the nearest positive-definite matrix, the above library includes isPD
which uses the Cholesky decomposition to determine whether a matrix is positive-definite. This way, you don’t need any tolerances—any function that wants a positive-definite will run Cholesky on it, so it’s the absolute best way to determine positive-definiteness.
It also has a Monte Carlo-based unit test at the end. If you put this in posdef.py
and run python posdef.py
, it’ll run a unit-test that passes in ~a second on my laptop. Then in your code you can import posdef
and call posdef.nearestPD
or posdef.isPD
.
The code is also in a Gist if you do that.
Upvotes: 26
Reputation: 446
I know this thread is kinda old, but just wanted to say that the question linked by @user1231818 now has a satisfactory answer, at least in the cases I've tested: https://stackoverflow.com/a/63131250/4733085
I'm leaving here the code, but for more details just follow the link:
import numpy as np
def get_near_psd(A):
C = (A + A.T)/2
eigval, eigvec = np.linalg.eig(C)
eigval[eigval < 0] = 0
return eigvec.dot(np.diag(eigval)).dot(eigvec.T)
Upvotes: 2