Furqan Hashim

Reputation: 1318

Is standardized scaling a pre-requisite for applying PCA using sklearn?

I have a set of 70 input variables on which I need to perform PCA. As I understand it, standardizing the data so that each input variable has mean 0 and variance 1 is necessary before applying PCA.

I am having a hard time figuring out whether I need to apply preprocessing.StandardScaler() before passing my data set to PCA, or whether the PCA function in sklearn does this on its own.

If the latter is the case, then explained_variance_ratio_ should be the same whether or not I apply preprocessing.StandardScaler().

But the results are different, so I believe preprocessing.StandardScaler() is necessary before applying PCA. Is that correct?

Upvotes: 7

Views: 4741

Answers (1)

hellpanderr

Reputation: 5906

Yes, that's true: scikit-learn's PCA does not standardize the input dataset. It only centers it by subtracting the per-feature mean.
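A quick way to check this yourself is sketched below. It is only an illustration on synthetic data (the array `X`, its scales, and its offsets are made up for the example, not taken from the question): centering the data by hand leaves explained_variance_ratio_ unchanged, while standardizing it first does not.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 200 samples, 5 features with very different
# scales and non-zero means (values chosen only for illustration).
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 100.0, 0.1, 5.0])
offsets = np.array([3.0, -2.0, 50.0, 0.0, 7.0])
X = rng.normal(size=(200, 5)) * scales + offsets

# PCA on the raw data, on manually centered data, and on standardized data.
ratio_raw = PCA().fit(X).explained_variance_ratio_
ratio_centered = PCA().fit(X - X.mean(axis=0)).explained_variance_ratio_
ratio_scaled = PCA().fit(StandardScaler().fit_transform(X)).explained_variance_ratio_

# Centering by hand changes nothing, because PCA already centers internally.
same_after_centering = np.allclose(ratio_raw, ratio_centered)
# Standardizing changes the result, because PCA does not rescale features.
same_after_scaling = np.allclose(ratio_raw, ratio_scaled)

print("raw == centered:", same_after_centering)
print("raw == scaled:  ", same_after_scaling)
```

If the features are on different scales and you want each to contribute equally, one common option is to chain the two steps, for example with sklearn.pipeline.make_pipeline(StandardScaler(), PCA()).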

See also this post.

Upvotes: 9
