Reputation: 6080
Does anyone know how to implement principal component analysis (PCA) on an m-by-n matrix in MATLAB for normalization?
Upvotes: 2
Views: 483
Reputation: 6887
Assuming each column is a sample (that is, you have n samples, each of dimension m) stored in a matrix A, you first have to subtract the mean sample from every column, i.e. remove the mean of each row:
Amm = bsxfun(@minus,A,mean(A,2));
Then you want to do an eigenvalue decomposition of 1/size(Amm,2)*Amm*Amm' (you can use 1/(size(Amm,2)-1) as the scale factor if you want an interpretation as an unbiased covariance matrix) with:
[v,d] = eig(1/size(Amm,2)*Amm*Amm');
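The columns of v are going to be your PCA vectors, and the diagonal entries of d are the corresponding "variances". Note that eig does not guarantee any particular ordering, so sort the eigenvalues in descending order (and reorder the columns of v to match) if you want the leading components first.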
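To make the steps above concrete, here is a minimal sketch of the eig route from start to finish. The toy data, the variable names variances, order, and scores, and the choice k = 2 are assumptions for illustration, not part of the original answer.

```matlab
% Toy data (assumed): m = 3 dimensions, n = 200 samples stored as columns.
rng(0);
A = bsxfun(@plus, [2 0 0; 0 1 0; 0 0 0.1] * randn(3, 200), [5; -2; 1]);

% Subtract the mean sample from every column (centers each row).
Amm = bsxfun(@minus, A, mean(A, 2));

% Eigendecomposition of the (biased) m-by-m covariance matrix.
C = (Amm * Amm') / size(Amm, 2);
[v, d] = eig(C);

% eig does not promise an ordering, so sort by decreasing variance.
[variances, order] = sort(diag(d), 'descend');
v = v(:, order);

% Project the centered data onto the first k principal components.
k = 2;
scores = v(:, 1:k)' * Amm;   % k-by-n matrix of reduced coordinates
```

Here scores holds the coordinates of each sample along the top k components; whether you then rescale them (for example by the square roots of variances) depends on what kind of normalization you are after.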
However, if your m is huge then this is not the best way to go, because storing the m-by-m matrix Amm*Amm' is not practical. You want to instead compute:
[u,s,v] = svd(1/sqrt(size(Amm,2))*Amm,'econ');
This time u contains your PCA vectors. The diagonal entries of s are the square roots of the corresponding entries of d, and svd returns them in descending order.
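As a quick sanity check of that square-root relationship, the following sketch compares the squared singular values against the eigenvalues of the covariance matrix on small random data. The data and the names svdVariances, eigVariances, and maxDiff are made up for illustration.

```matlab
% Toy data (assumed): m = 4 dimensions, n = 50 samples stored as columns.
rng(1);
A = randn(4, 50);
Amm = bsxfun(@minus, A, mean(A, 2));
n = size(Amm, 2);

% SVD route: the singular values come back in descending order.
[u, s, v] = svd(Amm / sqrt(n), 'econ');
svdVariances = diag(s).^2;

% eig route on the m-by-m covariance matrix, sorted to match.
eigVariances = sort(eig((Amm * Amm') / n), 'descend');

% The two sets of variances should agree up to round-off.
maxDiff = max(abs(svdVariances - eigVariances));
```

The columns of u can differ in sign from the eigenvectors returned by eig, which is expected since principal directions are only defined up to sign.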
Note: there's another way to go if m is huge, i.e. computing eig(1/size(Amm,2)*Amm'*Amm); (notice the switch of transposes as compared to above) and doing a little trickery, but it's a longer explanation so I won't get into it.
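For readers who want it anyway, one common version of that trickery (my reading of it, not necessarily what the answerer had in mind) goes like this: if w is an eigenvector of Amm'*Amm/n with eigenvalue lambda > 0, then Amm*w is an eigenvector of Amm*Amm'/n with the same eigenvalue, and its length is sqrt(n*lambda). A rough sketch, where the toy data, the tolerance 1e-10, and the names w, lambda, and keep are assumptions:

```matlab
% Toy data (assumed): m = 1000 dimensions but only n = 20 samples.
rng(2);
A = randn(1000, 20);
Amm = bsxfun(@minus, A, mean(A, 2));
n = size(Amm, 2);

% Solve the small n-by-n problem instead of the m-by-m one.
[w, d] = eig((Amm' * Amm) / n);
[lambda, order] = sort(diag(d), 'descend');
w = w(:, order);

% Centering leaves at most n-1 nonzero eigenvalues; drop the numerically zero ones.
keep = lambda > max(lambda) * 1e-10;
lambda = lambda(keep);
w = w(:, keep);

% Map back to m dimensions: Amm*w(:,i) has norm sqrt(n*lambda(i)),
% so dividing each column by that gives unit-length PCA vectors.
u = bsxfun(@rdivide, Amm * w, sqrt(n * lambda)');
```

This only pays off when n is much smaller than m, since the matrix being decomposed is n-by-n.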
Upvotes: 4