Greg Hennessy
Greg Hennessy

Reputation: 527

How do I associate which singular value corresponds to what entry?

I am using the numpy linalg routine lstsq to solve system of equations. My A matrix is size of (11046, 504) while my B matrix is size (11046, 1), and the rank determined is 249, so about half of the solved for x array isn't particularly useful. I'd like to use the s array of singular values to zero out the solved for parameters that correspond to the singular values, but it seems that the s array is sorted in order of decreasing statistical significance. Is there a way I can figure out which of my x's correspond to each singular value s?

Upvotes: 5

Views: 2152

Answers (1)

ThePhysicist
ThePhysicist

Reputation: 1862

To obtain the least-squares solution to the equation Mb = x as given by numpy.linalg.lstsq, you can also use numpy.linalg.svd, which calculates the singular-value decomposition M= U S V*. The best solution x is then given as x = V Sp U* b, where Sp is the pseudo-inverse of S. Given the matrices U and V* (containing the left- and right-singular vectors of your matrix M) and the singular values s, you can calculate the vector z=V*x. Now, all components z_i of z with i > rank(M) can be chosen arbitarily without altering the solution, as can thus all components x_j which are not contained in z_i for i <= rank(M).

Here's an example that demonstrates how to obtain the significant components of x, using sample data from the Wikipedia entry on singluar value decomposition:

import numpy as np

M = np.array([[1,0,0,0,2],[0,0,3,0,0],[0,0,0,0,0],[0,4,0,0,0]])

#We perform singular-value decomposition of M
U, s, V = np.linalg.svd(M)

S = np.zeros(M.shape,dtype = np.float64)

b = np.array([1,2,3,4])

m = min(M.shape)

#We generate the matrix S (Sigma) from the singular values s
S[:m,:m] = np.diag(s)

#We calculate the pseudo-inverse of S
Sp = S.copy()

for m in range(0,m):
  Sp[m,m] = 1.0/Sp[m,m] if Sp[m,m] != 0 else 0

Sp = np.transpose(Sp)

Us = np.matrix(U).getH()
Vs = np.matrix(V).getH()

print "U:\n",U
print "V:\n",V
print "S:\n",S

print "U*:\n",Us
print "V*:\n",Vs
print "Sp:\n",Sp

#We obtain the solution to M*x = b using the singular-value decomposition of the matrix
print "numpy.linalg.svd solution:",np.dot(np.dot(np.dot(Vs,Sp),Us),b)

#This will print:
#numpy.linalg.svd solution: [[ 0.2         1.          0.66666667  0.          0.4       ]]

#We compare the solution to np.linalg.lstsq
x,residuals,rank,s =  np.linalg.lstsq(M,b)

print "numpy.linalg.lstsq solution:",x
#This will print:
#numpy.linalg.lstsq solution: [ 0.2         1.          0.66666667  0.          0.4       ]

#We determine the significant (i.e. non-arbitrary) components of x

Vs_significant = Vs[np.nonzero(s)]

print "Significant variables:",np.nonzero(np.sum(np.abs(Vs_significant),axis = 0))[1]

#This will print:
#Significant variables: [[0 1 2 4]]
#(i.e. x_3 can be chosen arbitrarily without altering the result)

Upvotes: 4

Related Questions