Reputation: 6581
As far as I understand, PCA in general consists of the following steps:
X_reduced is X reduced to k dimensions.
But when I had a look at the SKLearn implementation, I found this line of code:
U *= S[:self.n_components_]
And this U is returned as the transformed X. Why is it still valid to use U and S here instead of X?
Upvotes: 0
Views: 698
Reputation:
Your understanding of steps 1-2 is incorrect. PCA can be implemented either by finding the eigenvectors of the covariance matrix or by applying SVD to the centered data matrix; you don't do both covariance and SVD. In practice the SVD approach is preferable, because forming the covariance matrix effectively squares the condition number and so amplifies the numerical issues associated with poorly conditioned matrices. And SKLearn uses it; here is the core of the method:
self.mean_ = np.mean(X, axis=0)
X -= self.mean_
U, S, V = linalg.svd(X, full_matrices=False)
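Either route gives the same principal directions. Here is a minimal sketch of that equivalence (not the library code; the random data and shapes are made up purely for illustration):

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))        # made-up data, just for the check
X = X - X.mean(axis=0)                   # center, as the method above does

# Route 1: eigenvectors of the covariance matrix
cov = X.T @ X / (X.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalue order
eigvecs = eigvecs[:, ::-1]               # largest-variance directions first

# Route 2: SVD of the centered data (no covariance matrix is formed)
U, S, Vt = linalg.svd(X, full_matrices=False)

# Same directions up to sign, and eigenvalues = S**2 / (n - 1)
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))            # True
print(np.allclose(S**2 / (X.shape[0] - 1), eigvals[::-1]))   # True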
SVD represents X as U @ S @ V.T (using @ for matrix multiplication and assuming real-valued data). Here the columns of V are the eigenvectors of the covariance matrix, and V satisfies the orthogonality relation V.T @ V = I. In terms of these eigenvectors, the transformed data is X @ V. But since X is equal to U @ S @ V.T, multiplying both sides on the right by V shows that X @ V is equal to U @ S. So U @ S is the transformed data.
It is also cheaper to scale U by the diagonal S than to compute the full product X @ V with the dense matrix X.
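A quick numerical check of this identity, again with made-up data (the k = 2 truncation is just an arbitrary choice for the sketch), also shows how the vector S plays the role of the diagonal matrix via broadcasting:

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))               # made-up data
X = X - X.mean(axis=0)

U, S, Vt = linalg.svd(X, full_matrices=False)   # X == U @ np.diag(S) @ Vt
V = Vt.T                                        # columns of V are the eigenvectors

# Projecting X onto the eigenvectors gives the same matrix as scaling U by S.
# S is a 1-D array, so U * S broadcasts it across the columns of U,
# i.e. it acts as the diagonal matrix in U @ np.diag(S).
print(np.allclose(X @ V, U * S))                # True

# Keeping only the leading k components mirrors the
# U *= S[:self.n_components_] line from the question:
k = 2
print(np.allclose((X @ V)[:, :k], U[:, :k] * S[:k]))   # True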
For more, see "Relationship between SVD and PCA. How to use SVD to perform PCA?"
Upvotes: 3