Reputation: 371
I am trying to compute the inverse of a full-rank matrix using numpy, but when I test the dot product, I find that it does not result in the identity matrix - which means it did not invert properly.
My code:
H = calculateLogisticHessian(theta, X) #returns a 5x5 matrix
Hinv = np.linalg.inv(H)
print("H = " + str(H))
print("Hinv = " + str(Hinv))
I = np.dot(H, Hinv)
isIdentity = np.allclose(I , np.eye(5))
print("invdotinv = " + str(isIdentity) + "\n" + str(I))
and the output:
H = [[ 77.88167948 81.49914902 85.11661855 88.73408809 92.35155763]
[ 81.49914902 85.36097831 89.2228076 93.0846369 96.94646619]
[ 85.11661855 89.2228076 93.32899665 97.4351857 101.54137475]
[ 88.73408809 93.0846369 97.4351857 101.7857345 106.1362833 ]
[ 92.35155763 96.94646619 101.54137475 106.1362833 110.73119186]]
Hinv = [[ 1.41918134e+02 1.00000206e+08 -1.00000632e+08 -9.99999204e+07
1.00000205e+08]
[ 1.00000347e+08 1.00000647e+08 -4.00001421e+08 9.99994941e+07
1.00000932e+08]
[ -1.00000916e+08 -4.00001424e+08 8.00003700e+08 5.68436971e+02
-3.00001928e+08]
[ -9.99997780e+07 1.00000065e+08 -5.72321511e+02 1.00000063e+08
-9.99997769e+07]
[ 1.00000205e+08 1.00000505e+08 -3.00001073e+08 -1.00000205e+08
2.00000567e+08]]
invdotinv = False
[[ 1.00000000e+00 -3.81469727e-06 -7.62939453e-06 3.81469727e-06
3.81469727e-06]
[ 0.00000000e+00 1.00000191e+00 -1.52587891e-05 3.81469727e-06
0.00000000e+00]
[ -3.81469727e-06 1.90734863e-06 9.99992371e-01 3.81469727e-06
3.81469727e-06]
[ 1.90734863e-06 -1.90734863e-06 -7.62939453e-06 1.00000191e+00
3.81469727e-06]
[ 0.00000000e+00 -1.90734863e-06 0.00000000e+00 0.00000000e+00
1.00000000e+00]]
As you can see the np.dot(H, Hinv)
matrix does not return identity and results in False
when evaluating np.allclose(I , np.eye(5))
.
What am I doing wrong?
Later edit
this is the function which calculates the hessian:
def calculateLogisticHessian(theta, X):
'''
calculate the hessian matrix based on given function, assuming it is some king of logistic funciton
:param theta: the weights
:param x: 2d array of arguments
:return: the hessian matrix
'''
m, n = X.shape
H = np.zeros((n,n))
for i in range(0,m):
hxi = h(theta, X[i]) #in case of logistic, will return p(y|x)
xiDotxiT = np.outer(X[i], np.transpose(X[i]))
hxiTimesOneMinHxi = hxi*(1-hxi)
currh = np.multiply(hxiTimesOneMinHxi, xiDotxiT)
H = np.add(H, currh)
return np.divide(H, m)
which should be according to the hessian calculation formula in andrew ng's video regarding newtons method for logistic regression:
https://youtu.be/fF-6QnVB-7E?t=5m6s at 5:06
1/m * (SUM from i=1 till m of[h(X[i]) * (1 - h(X[i]) * (X[i] * X[i]'T)])
where X is the 2x2 matrix of data and h() is the function based on theta (theta is the weigts) which in this case returns the logistic function.
the inputs I used:
theta = np.array([0.001, 0.002, 0.003, 0.004, 0.005])
X = np.array(range(5*7))
X = X.reshape((7,5))
H = calculateLogisticHessian(theta, X)
so is there an error in the way I iplemented the hessian formula or is the issue in the inputs, and what is the issue?
Thanks!
Upvotes: 1
Views: 691
Reputation: 18628
Hessian matrix are often ill-conditioned. numpy.linalg.cond
lets you compute the condition number:
In [188]: np.linalg.cond(H)
Out[188]: 522295671550.72644
Since the condition number of H
is large, computing its inverse has rounding issues .
Upvotes: 2