Saikat

Reputation: 1219

PCA using SVD in OpenCV

I have a matrix M of dimension m*n. M contains n data samples, each of dimension m, where m is much larger than n.

Now my question is: what are the steps to compute the PCA of M using SVD in OpenCV, keeping only those eigenvectors that contain 99% of the total load or energy?

Upvotes: 2

Views: 3688

Answers (2)

lightalchemist

Reputation: 10219

You first need to compute the covariance matrix C from your data matrix M. You can either use OpenCV's calcCovarMatrix function or simply compute C = (M - mu)' x (M - mu), where I assume that your data samples are stored as rows in M, mu is the mean of your data samples, and A' is the matrix A transposed.
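A minimal sketch of this step in NumPy terms (the data and the names M, mu, C here are illustrative; OpenCV's calcCovarMatrix with the COVAR_NORMAL and COVAR_ROWS flags should give the same matrix):

    import numpy as np

    # Illustrative data: n samples as rows, each of dimension m (m >> n, as in the question).
    rng = np.random.default_rng(0)
    n, m = 20, 100
    M = rng.standard_normal((n, m))

    mu = M.mean(axis=0)   # mean sample, length m
    Mc = M - mu           # centered data
    C = Mc.T @ Mc         # m x m covariance, C = (M - mu)' (M - mu)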

Next, perform SVD on C to get C = USU', where U' is U transposed. In this case, the V' from the SVD is the same as U' because C is symmetric and positive definite (if C is full rank) or positive semidefinite (if it is rank deficient). The columns of U contain the eigenvectors of C.
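Continuing that sketch, the SVD step (np.linalg.svd standing in for OpenCV's SVD):

    # SVD of the symmetric covariance matrix: C = U * diag(S) * Vt.
    # Since C is symmetric positive semidefinite, Vt is U transposed, the
    # columns of U are eigenvectors of C, and S holds the corresponding
    # eigenvalues (variances), already sorted in descending order.
    U, S, Vt = np.linalg.svd(C)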

What you want to do is keep k eigenvectors, i.e. the k columns (or rows? Check the OpenCV docs whether it returns the eigenvectors as rows or columns) of U whose corresponding singular values in the matrix S are the k largest AND whose sum, divided by the sum of all the singular values, is >= 0.99. The singular values here correspond to the variances along each principal direction of your feature vectors, so keeping the top k retains 0.99, i.e. 99%, of the variance/energy.
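Choosing k for the 99% threshold is then a cumulative sum over the singular values (continuing the sketch above):

    # Smallest k whose top singular values account for >= 99% of the total energy.
    energy = np.cumsum(S) / np.sum(S)
    k = int(np.searchsorted(energy, 0.99)) + 1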

These eigenvectors, packed together into a matrix, say Uk, form your PCA basis. Because these eigenvectors are also orthogonal to each other, the transpose of Uk, Uk', is the projection matrix. To get the dimension-reduced version of a new test sample x, simply compute x_reduced = Uk' * (x - mu).
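And the projection step, again continuing the sketch (x is an illustrative new sample):

    Uk = U[:, :k]                   # top-k eigenvectors as columns: the PCA basis

    x = rng.standard_normal(m)      # a new test sample
    x_reduced = Uk.T @ (x - mu)     # k-dimensional representation, Uk' * (x - mu)
    x_approx = Uk @ x_reduced + mu  # optional: reconstruction in the original space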

Upvotes: 4

Roger Rowland

Reputation: 26279

Generally, for PCA (i.e. not specific to OpenCV), you would start with a covariance matrix. So in your case, the input would be an m*m square matrix formed from the pairwise covariances between the components of your original samples.

Then you do an eigenvector decomposition on that (very large) square symmetric matrix and extract the topmost eigenvectors you require, using the corresponding eigenvalues to determine the percentage of variance covered.
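A minimal sketch of that step, with NumPy's symmetric eigensolver standing in for OpenCV's eigen function (the covariance matrix C here is illustrative):

    import numpy as np

    # An illustrative m x m covariance matrix computed from some samples.
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((200, 10))
    C = np.cov(samples, rowvar=False)

    evals, evecs = np.linalg.eigh(C)   # eigh is for symmetric matrices
    order = np.argsort(evals)[::-1]    # eigh sorts ascending; we want descending
    evals, evecs = evals[order], evecs[:, order]

    coverage = np.cumsum(evals) / np.sum(evals)  # variance covered by the top-k cut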

If the scales of your original variables are not similar - i.e. you didn't normalise your data - you can use a correlation matrix instead of a covariance matrix.
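One way to see the connection: standardising each variable to unit variance makes the covariance matrix of the standardised data equal the correlation matrix of the original data. A quick sketch with illustrative data:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 4)) * [1.0, 10.0, 100.0, 1000.0]  # very different scales

    Z = (A - A.mean(axis=0)) / A.std(axis=0)  # standardise each column
    # Covariance of the standardised data == correlation matrix of the original data:
    assert np.allclose(np.cov(Z, rowvar=False, ddof=0), np.corrcoef(A, rowvar=False))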

For PCA using OpenCV, a Google search turns up some very useful examples.
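For instance, recent OpenCV versions expose PCA with a retained-variance parameter directly, so the 99% criterion needs no manual bookkeeping (a sketch assuming the Python bindings; check your version's docs for the exact overloads):

    import numpy as np
    import cv2

    # n samples as rows, each of dimension m, as a float array (OpenCV requirement).
    rng = np.random.default_rng(0)
    data = rng.standard_normal((20, 100)).astype(np.float32)

    # Keep just enough principal components to retain 99% of the variance.
    mean, eigenvectors = cv2.PCACompute(data, mean=None, retainedVariance=0.99)

    projected = cv2.PCAProject(data, mean, eigenvectors)  # dimension-reduced samples
    print(eigenvectors.shape)  # (k, m): OpenCV returns the eigenvectors as rows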

Upvotes: 2
