Reputation: 439
In machine learning, PCA is used to reduce the dimensionality of training data. However, from the above picture, I can't see where the reduction happens: the input data x_i has D dimensions, and the output data x still has D dimensions.
Upvotes: -1
Views: 1009
Reputation: 4918
The important thing to understand when using PCA is the covariance matrix C(x) and its spectral decomposition. The eigenvalues and eigenvectors obtained from that decomposition are what is used to reduce the dimensionality. For a D-dimensional training set, we have D eigenvalues and their corresponding eigenvectors. But in practice (especially in image-related applications) many of the eigenvalues are very small, so their eigenvectors capture almost no variance; in other words, many of them are redundant basis vectors. Discarding those vectors from the basis doesn't result in significant information loss.
Now, if you want to reduce the dimension of your input data from the original D to d < D dimensions, you can project the input data onto the d dominant eigenvectors (those with the d largest eigenvalues). Eq. 29 gives the projection of the input data into the d-dimensional space. Eq. 30 is used to reconstruct the original data; the reconstruction error depends on d (the number of retained eigenvectors).
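Since the equations themselves aren't reproduced here, this is a sketch of what Eq. 29 and Eq. 30 typically describe under the usual PCA conventions (U_d, mean, and the function names are my own labels):

```python
import numpy as np

def pca_project(X, d):
    """Project X (N x D) onto the d dominant eigenvectors (cf. Eq. 29)."""
    mean = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    U_d = eigvecs[:, np.argsort(eigvals)[::-1][:d]]    # D x d basis
    return (X - mean) @ U_d, U_d, mean                 # N x d codes

def pca_reconstruct(Y, U_d, mean):
    """Map d-dimensional codes back to D dimensions (cf. Eq. 30)."""
    return Y @ U_d.T + mean                            # N x D approximation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))
for d in (2, 5, 10):
    Y, U_d, mean = pca_project(X, d)
    X_hat = pca_reconstruct(Y, U_d, mean)
    print(d, np.mean((X - X_hat) ** 2))    # error shrinks as d grows; ~0 at d = D
```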
Upvotes: 1
Reputation: 66795
The crucial element here is a misunderstanding of what the output is. In this pseudocode the output is y (equation 29), not x (equation 30); consequently, you do reduce your data to d dimensions. The final equation shows that if you would like to move back to the original space, you can (obviously the data will be recovered with some error, since we dropped a lot of information when going down to d dimensions).
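A quick way to see the shapes, sketched here with scikit-learn rather than the book's pseudocode (fit_transform returns the d-dimensional y, and inverse_transform plays the role of equation 30):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 50))  # N=100 samples, D=50

pca = PCA(n_components=5)
y = pca.fit_transform(X)           # the output: shape (100, 5), i.e. d = 5
x_back = pca.inverse_transform(y)  # optional reconstruction: shape (100, 50)

print(y.shape, x_back.shape)       # (100, 5) (100, 50)
print(np.mean((X - x_back) ** 2))  # nonzero: information was dropped
```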
Upvotes: 2