Number of components in PCA limited by the number of samples

I'm using sklearn to do PCA and testing the functions with some dummy data. When I have more samples than the number of components I want to keep, it works just fine:

from sklearn.decomposition import PCA
import numpy as np    

features_training = np.random.rand(10,30)
components = 8
pca = PCA(n_components=int(components))
X_pca = pca.fit_transform(features_training)

From the code above I get a 10*8 matrix.

X_pca.shape
(10, 8)

But for the same data, if I try to keep 15 components:

features_training = np.random.rand(10,30)
components = 15
pca = PCA(n_components=int(components))
X_pca = pca.fit_transform(features_training)

I don't get a 10*15 matrix but a 10*10 one.

X_pca.shape
(10, 10)

So it seems that the number of components is limited not only by the number of features but also by the number of samples. Why is that?

Upvotes: 0

Views: 549

Answers (1)

Vivek Kumar

Reputation: 36619

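I can't explain in detail how PCA works internally, but the scikit-learn documentation for PCA indicates that the number of components actually kept is capped by the data: it cannot exceed min(n_samples, n_features). With 10 samples and 30 features, that cap is 10, so asking for 15 components still gives you at most 10.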
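As for why the cap exists, here is a minimal NumPy sketch (not scikit-learn's actual implementation) under the assumption that PCA amounts to an SVD of the mean-centered data. A matrix with 10 rows has at most 10 singular values, and centering removes one more degree of freedom, so at most n_samples - 1 directions carry any variance. The variable names and the fixed seed are my own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen only for reproducibility
X = rng.random((10, 30))            # 10 samples, 30 features, as in the question
n_samples, n_features = X.shape

# PCA operates on mean-centered data
X_centered = X - X.mean(axis=0)

# Thin SVD: the rows of Vt are the candidate principal directions
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Only min(n_samples, n_features) singular values exist at all
print(S.shape, Vt.shape)

# Centering makes the rows sum to zero, so the rank drops to at most n_samples - 1
rank = np.linalg.matrix_rank(X_centered)
print(rank <= n_samples - 1)

# Projecting onto every available direction gives at most 10 columns
X_pca = X_centered @ Vt.T
print(X_pca.shape)
```

Requesting 15 components would need 15 independent directions of variance, and 10 points in a 30-dimensional space cannot provide that many.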

Upvotes: 1
