hjkim

Reputation: 25

Doing PCA and whitening with MATLAB

My task is to apply PCA and a whitening transform to a given set of 5000 two-dimensional data points.

My understanding of PCA is that it finds the main axis of the data from the eigenvectors of the covariance matrix and then rotates that main axis onto the x axis.

So here's what I did.

[BtEvector,BtEvalue]=eig(MYCov);% Eigen value and vector using built-in function

I first calculated the eigenvalues and eigenvectors. The result was

BtEvalue=[4.027487815706757,0;0,8.903923357227459] 

and

BtEvector=[0.033937679569230,-0.999423951036524;-0.999423951036524,-0.033937679569230]

So I figured that the main axis has eigenvalue 8.903923357227459 and eigenvector [-0.999423951036524,-0.033937679569230], which is the second column of BtEvector.
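The selection of the main axis described above can be sketched as follows, using the values quoted in the question. The names `idx`, `sortedValues`, `mainValue` and `mainAxis` are hypothetical; sorting explicitly avoids relying on the order in which `eig` happens to return the eigenvalues.

```matlab
% Eigenvalues and eigenvectors as quoted in the question
BtEvalue  = [4.027487815706757, 0; 0, 8.903923357227459];
BtEvector = [ 0.033937679569230, -0.999423951036524;
             -0.999423951036524, -0.033937679569230];

% Sort the eigenvalues in descending order; idx(1) is the column of the main axis
[sortedValues, idx] = sort(diag(BtEvalue), 'descend');
mainValue = sortedValues(1);        % largest variance
mainAxis  = BtEvector(:, idx(1));   % matching eigenvector (a column of BtEvector)
```

With these numbers, `idx(1)` is 2, so the main axis is the second column of `BtEvector`.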

After that, because the data is two-dimensional, I set cos(theta) = -0.9994.. and sin(theta) = -0.033937. Because I thought the main axis of the data (eigenvector [-0.999423951036524,-0.033937679569230]) has to become the x axis, I built the rotation matrix R = [cos(-theta) -sin(-theta); sin(-theta) cos(-theta)]. With the original data set A of size 2*5000, I computed A*R to get the rotated data.
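For comparison, here is a minimal sketch of the rotation step that skips the explicit angle and projects the centered data onto the sorted eigenvectors. It is an illustration under stated assumptions, not a verified fix for the code above: the data here is synthetic, and `Ac`, `V`, `Arot` and `CovRot` are hypothetical names. With one point per column (2 x 5000), the 2 x 2 matrix has to multiply from the left for the dimensions to agree.

```matlab
% Synthetic stand-in for the real data (assumption): 2 x 5000, one point per column
A = [3 1; 1 2] * randn(2, 5000);

% Center the data, then estimate the 2 x 2 covariance
% (cov expects one observation per row, hence the transpose)
Ac    = A - mean(A, 2);
MYCov = cov(Ac');

% Eigen-decomposition, with eigenvectors reordered by descending eigenvalue
[V, D]   = eig(MYCov);
[~, idx] = sort(diag(D), 'descend');
V        = V(:, idx);               % first column = main axis

% Express every point in the eigenvector basis: (2 x 2) * (2 x 5000)
Arot = V' * Ac;

% The covariance of the rotated data should be (numerically) diagonal
CovRot = cov(Arot');
```

Because the columns of `V` are orthonormal, `V'` acts as a rotation, possibly combined with a reflection depending on the signs `eig` returns. After the transform, the largest variance lies along the first coordinate.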

Also, for the whitening step, using Cholesky whitening, I used inv(Covariance Matrix) as the whitening transformation matrix.

Is there something wrong with my algorithm? Could someone check whether there is an error or misunderstanding? Thanks a lot in advance.
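As a point of reference, one common formulation of Cholesky whitening (which may differ from the one the assignment intends) uses a triangular Cholesky factor of the covariance rather than the inverse covariance itself. The sketch below uses synthetic data, and `Ac`, `L`, `Z` and `CovZ` are hypothetical names.

```matlab
% Synthetic stand-in for the real data (assumption): 2 x 5000, one point per column
A = [3 1; 1 2] * randn(2, 5000);

% Center the data and estimate the covariance (rows = observations for cov)
Ac    = A - mean(A, 2);
MYCov = cov(Ac');

% Lower-triangular Cholesky factor: MYCov = L * L'
L = chol(MYCov, 'lower');

% Whitening: solve L * Z = Ac, i.e. Z = inv(L) * Ac without forming the inverse
Z = L \ Ac;

% The covariance of the whitened data should be the identity
CovZ = cov(Z');
```

Here the covariance of `Z` equals `inv(L) * MYCov * inv(L)'`, which reduces to the identity. Multiplying by `inv(MYCov)` directly would instead give a covariance of `inv(MYCov)`, which is not the identity in general.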

Upvotes: 1

Views: 1001

Answers (1)

idnavid

Reputation: 1996

Since your data is two-dimensional, the covariance matrix that you calculated is not accurate. If you only calculate the covariance with respect to one axis (say x), you're assuming that the covariance along the y axis is the identity. This is obviously not true. Although you've attempted to address this, there's a sound procedure that you can use, which I've explained below.

Unfortunately, this is a common mistake. Have a look at this paper, where it is explained exactly how the covariance should be calculated.

In summary, you can calculate the covariance along each axis (Sx and Sy). Then approximate the 2D covariance of the vectorized matrix as kron(Sx,Sy). This will be a better approximation of the 2D covariance.

Upvotes: 1
