Bazman

Reputation: 2150

Difference in Matlab results when using PCA() and PCACOV()

Closest match I can get is to run:

  data=rand(100,10);  % data set
  [W,pc] = pca(cov(data));

Then, without demeaning:

  data2 = data;
  [W2, EvalueMatrix2] = eig(cov(data2));
  [W3, EvalueMatrix3] = svd(cov(data2));

In this case W2 and W3 agree, and W appears to be the transpose of them.

It is still not clear to me why W should be the transpose of the other two.

As an extra check I use pcacov:

   [W4, EvalueMatrix4] = pcacov(cov(data2));

Again it agrees with W2 and W3 but is the transpose of W?

Upvotes: 0

Views: 1129

Answers (1)

user20160

Reputation: 1394

The results are different because you're subtracting the mean of each row of the data matrix. Based on the way you're computing things, rows of the data matrix correspond to data points and columns correspond to dimensions (this is how the pca() function works too). With this setup, you should subtract the mean from each column, not row. This corresponds to 'centering' the data; the mean along each dimension is set to zero. Once you do this, results should be equivalent to pca(), up to sign flips.
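To make the row-versus-column point concrete, here is a minimal sketch using only base MATLAB. The seed and the variable names (colCentered, rowCentered, errCol, errRow) are my own choices for illustration and are not taken from the question.

```matlab
rng(0);                                   % fixed seed (assumption) so the run is repeatable
data = rand(100, 10);                     % rows are data points, columns are dimensions
n = size(data, 1);

% Implicit expansion needs R2016b or later; use bsxfun(@minus, ...) on older releases.
colCentered = data - mean(data, 1);       % subtract each column's mean (correct centering)
rowCentered = data - mean(data, 2);       % subtract each row's mean (the mistake)

% Sample covariance built by hand from each centered matrix.
Ccol = (colCentered' * colCentered) / (n - 1);
Crow = (rowCentered' * rowCentered) / (n - 1);

% cov() centers the columns itself, so only Ccol should reproduce it.
errCol = norm(Ccol - cov(data), 'fro');   % expected to be at round-off level
errRow = norm(Crow - cov(data), 'fro');   % expected to be clearly nonzero
```

Only the column-centered matrix reproduces cov(data), so any eigenvectors computed from the row-centered version cannot be expected to match pca().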

Edit to address the edited question: the centering issue looks OK now. When you run the eigenvalue decomposition on the covariance matrix, remember that eig() does not guarantee any particular ordering, so sort the eigenvectors in order of descending eigenvalue yourself. The result should then match the output of pcacov(), up to sign flips. When calling pca(), you have to pass it the data matrix, not the covariance matrix.
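A short sketch of that comparison, assuming the Statistics and Machine Learning Toolbox is installed for pca() and pcacov(). The seed and the names evals, order, alignCov and alignPca are illustrative only.

```matlab
% Requires the Statistics and Machine Learning Toolbox for pca() and pcacov().
rng(1);                                   % fixed seed (assumption) so the run is repeatable
data = rand(100, 10);                     % rows are observations, columns are variables
C = cov(data);                            % cov() subtracts the column means itself

% Sort the eigenpairs explicitly by descending eigenvalue.
[V, D] = eig(C);
[evals, order] = sort(diag(D), 'descend');
V = V(:, order);

[coeffCov, latentCov] = pcacov(C);        % pcacov() takes the covariance matrix
[coeffPca, ~, latentPca] = pca(data);     % pca() takes the data matrix itself

% Eigenvalues should agree directly; eigenvectors only up to a sign per column.
evalDiffCov = max(abs(evals - latentCov));
evalDiffPca = max(abs(evals - latentPca));
alignCov = abs(sum(V .* coeffCov, 1));    % |dot product| of matching columns, ideally 1
alignPca = abs(sum(V .* coeffPca, 1));
```

Comparing absolute dot products rather than the raw matrices sidesteps the sign ambiguity, since an eigenvector and its negative are equally valid.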

Upvotes: 2
