Saikat

Reputation: 1219

PCA using SVD in OpenCV

I have a matrix M of dimension m*n. M contains n data samples, each of dimension m, where m is much larger than n.

Now my question is: what are the steps to compute the PCA of M using SVD in OpenCV, keeping only those eigenvectors that contain 99% of the total load or energy?

Upvotes: 2

Views: 3688

Answers (2)

lightalchemist

Reputation: 10219

You first need to compute the covariance matrix C from your data matrix M. You can either use OpenCV's calcCovarMatrix function or simply compute C = (M - mu)' x (M - mu), where I assume that your data samples are stored as rows in M, mu is the mean of your data samples, and A' is the matrix A transposed.
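A minimal sketch of this step in NumPy terms (the data and the names M, mu, C here are illustrative; OpenCV's calcCovarMatrix with the COVAR_NORMAL and COVAR_ROWS flags should give the same matrix):

    import numpy as np

    # Illustrative data: n samples as rows, each of dimension m (m >> n, as in the question).
    rng = np.random.default_rng(0)
    n, m = 20, 100
    M = rng.standard_normal((n, m))

    mu = M.mean(axis=0)   # mean sample, length m
    Mc = M - mu           # centered data
    C = Mc.T @ Mc         # m x m covariance, C = (M - mu)' (M - mu)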

Next, perform SVD on C to get C = USU', where U' is U transposed. In this case, the V' from the SVD is the same as U' because C is symmetric and positive definite (if C is full rank) or positive semidefinite (if it is rank deficient). The columns of U contain the eigenvectors of C.
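Continuing that sketch, the SVD step (np.linalg.svd standing in for OpenCV's SVD):

    # SVD of the symmetric covariance matrix: C = U * diag(S) * Vt.
    # Since C is symmetric positive semidefinite, Vt is U transposed, the
    # columns of U are eigenvectors of C, and S holds the corresponding
    # eigenvalues (variances), already sorted in descending order.
    U, S, Vt = np.linalg.svd(C)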

What you want to do is keep k eigenvectors, i.e. the k columns (or rows? Check the OpenCV docs whether it returns the eigenvectors as rows or columns) of U whose corresponding singular values in the matrix S are the k largest AND whose sum, divided by the sum of all the singular values, is >= 0.99. The singular values here correspond to the variances along each principal direction of your feature vectors, so keeping the top k retains 0.99, i.e. 99%, of the variance/energy.
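Choosing k for the 99% threshold is then a cumulative sum over the singular values (continuing the sketch above):

    # Smallest k whose top singular values account for >= 99% of the total energy.
    energy = np.cumsum(S) / np.sum(S)
    k = int(np.searchsorted(energy, 0.99)) + 1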

These eigenvectors, packed together into a matrix, say Uk, form your PCA basis. Because these eigenvectors are also orthogonal to each other, the transpose of Uk, Uk', is the projection matrix. To get the dimension-reduced version of a new test sample x, simply compute x_reduced = Uk' * (x - mu).
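And the projection step, again continuing the sketch (x is an illustrative new sample):

    Uk = U[:, :k]                   # top-k eigenvectors as columns: the PCA basis

    x = rng.standard_normal(m)      # a new test sample
    x_reduced = Uk.T @ (x - mu)     # k-dimensional representation, Uk' * (x - mu)
    x_approx = Uk @ x_reduced + mu  # optional: reconstruction in the original space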

Upvotes: 4

Roger Rowland

Reputation: 26279

Generally, for PCA (i.e. not specific to OpenCV), you would start with a covariance matrix. So in your case, the input would be an m*m square matrix formed from the pairwise covariances between the components of your original samples.

Then you do an eigenvector decomposition on that (very large) square symmetric matrix and extract the topmost eigenvectors you require, using the corresponding eigenvalues to determine the percentage of variance covered.
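A minimal sketch of that step, with NumPy's symmetric eigensolver standing in for OpenCV's eigen function (the covariance matrix C here is illustrative):

    import numpy as np

    # An illustrative m x m covariance matrix computed from some samples.
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((200, 10))
    C = np.cov(samples, rowvar=False)

    evals, evecs = np.linalg.eigh(C)   # eigh is for symmetric matrices
    order = np.argsort(evals)[::-1]    # eigh sorts ascending; we want descending
    evals, evecs = evals[order], evecs[:, order]

    coverage = np.cumsum(evals) / np.sum(evals)  # variance covered by the top-k cut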

If the scales of your original variables are not similar - i.e. you didn't normalise your data - you can use a correlation matrix instead of a covariance matrix.
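One way to see the connection: standardising each variable to unit variance makes the covariance matrix of the standardised data equal the correlation matrix of the original data. A quick sketch with illustrative data:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 4)) * [1.0, 10.0, 100.0, 1000.0]  # very different scales

    Z = (A - A.mean(axis=0)) / A.std(axis=0)  # standardise each column
    # Covariance of the standardised data == correlation matrix of the original data:
    assert np.allclose(np.cov(Z, rowvar=False, ddof=0), np.corrcoef(A, rowvar=False))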

For PCA using OpenCV, a Google search turns up some very useful examples.
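For instance, recent OpenCV versions expose PCA with a retained-variance parameter directly, so the 99% criterion needs no manual bookkeeping (a sketch assuming the Python bindings; check your version's docs for the exact overloads):

    import numpy as np
    import cv2

    # n samples as rows, each of dimension m, as a float array (OpenCV requirement).
    rng = np.random.default_rng(0)
    data = rng.standard_normal((20, 100)).astype(np.float32)

    # Keep just enough principal components to retain 99% of the variance.
    mean, eigenvectors = cv2.PCACompute(data, mean=None, retainedVariance=0.99)

    projected = cv2.PCAProject(data, mean, eigenvectors)  # dimension-reduced samples
    print(eigenvectors.shape)  # (k, m): OpenCV returns the eigenvectors as rows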

Upvotes: 2
