Reputation: 300
I am trying to make a scree plot for kernel PCA. My X has 78 features and 247K samples. I am new to kernel PCA, but I have used scree plots for linear PCA
multiple times. The code below produces the scree plot for linear PCA. I want to use the scree plot to decide the number of components I will need before actually fitting the model.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

pca = PCA().fit(X)
plt.figure()
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of Principal Components')
plt.ylabel('Cumulative Explained Variance Ratio')
plt.title('Dataset Explained Variance')
plt.show()
I tried to do the same for kernel PCA, but the explained_variance_ratio_
attribute doesn't exist for KernelPCA, which is why I did it the following way:
from sklearn.decomposition import KernelPCA

# Transform the first 1000 rows with an RBF-kernel PCA
transformed = KernelPCA(kernel='rbf', gamma=10, fit_inverse_transform=False).fit_transform(scaled_merged.iloc[0:1000, :])
explained_variance = np.var(transformed, axis=0)
explained_variance_ratio = explained_variance / np.sum(explained_variance)
plt.figure()
plt.plot(np.cumsum(explained_variance_ratio))
plt.xlabel('Number of Components')
plt.ylabel('Cumulative Explained Variance Ratio')
plt.title('Dataset Explained Variance')
plt.show()
The scree plot for kernel PCA seems off: it shows that I need about 150 components to capture close to 90% of the variance. Am I doing something wrong in my code?
Upvotes: 0
Views: 1076
Reputation: 41
The reason is simple. The sum of eigenvalues in kernel PCA (kPCA) corresponds to the total explained variance in the feature space, which depends on your choice of kernel function. With an RBF kernel, kPCA is equivalent to classical PCA in an infinite-dimensional feature space, and the kPCA eigenvalues are the eigenvalues in that feature space. That's why the scree plot differs from the one for PCA, which corresponds to a linear kernel. So, if you use a linear kernel, the result should be the same as PCA in the input space.
The correct code for a fair comparison is:
pca = KernelPCA(kernel='linear', fit_inverse_transform=False).fit_transform(scaled_merged.iloc[0:1000, :])
In short, the eigenvalues in kPCA are not supposed to be interpreted like those of classical PCA except when the kernel is linear. The optimization problem for kPCA is the dual problem of classical PCA in the feature space, not in the input space.
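A minimal sketch on synthetic data (the data and variable names here are made up for illustration) verifying that with a linear kernel, the variance ratios computed from the kPCA scores match PCA's explained_variance_ratio_:

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Synthetic data with correlated features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))

# Classical PCA: ratios come directly from the fitted model
pca_ratio = PCA().fit(X).explained_variance_ratio_

# Linear-kernel kPCA: derive the same ratios from the score variances,
# exactly as in the question's code
scores = KernelPCA(kernel='linear').fit_transform(X)
var = np.var(scores, axis=0)
kpca_ratio = var / var.sum()

# The leading ratios agree; kPCA may keep a few extra near-zero components
print(np.allclose(pca_ratio, kpca_ratio[:len(pca_ratio)]))
```

Running the same comparison with kernel='rbf' instead would give a very different scree plot, which is the effect described above, not a bug.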
Reference:
Schölkopf, B., Smola, A., & Müller, K. R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural computation, 10(5), 1299-1319.
Upvotes: 2