Why are the wrong clusters projected onto PCA using Sklearn?

Question

I am projecting my k-means cluster centers onto the first two principal components, but the plotted centers do not land in the middle of my two groups of data points. My code is given below. Does anyone see where I am going wrong? The PCA of the data itself looks fine, but one of the projected centroids is way off. I will mention that half of my centroid coordinates are negative. I have played around with inverting the PCA transform and am really not sure where the error is coming from. Any help is greatly appreciated!

import numpy as np
import sklearn
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt


data = normalize(key) 

key_N=normal(key,key)

pca=PCA(n_components=21)
pca.fit(data[:,0:-1])
keytrain_T = pca.transform(data[:,0:-1])

kmeans = KMeans(n_clusters=2, init='k-means++', n_init=100, max_iter=300, 
            tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1)
kmeans.fit(data[:,0:-1])

centroid = kmeans.cluster_centers_
print("The centroids:",centroid)

# Project the cluster points to the two first principal components
clusters = pca.fit_transform(centroid)

print("The clusters:",clusters)


Answers (1)
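
The most likely cause is the line `clusters = pca.fit_transform(centroid)`. Calling `fit_transform` refits the PCA on the two centroids alone, so they are projected into a new coordinate system that no longer matches the one used for the data. Calling `pca.transform(centroid)` on the already fitted PCA keeps the data and the centroids in the same space. The sketch below illustrates this on synthetic data. The two Gaussian blobs, their sizes, and the variable names are assumptions made for illustration, not the asker's actual data. Note also that the question fits `PCA(n_components=21)`, so a 2-D plot would use only the first two columns of the transformed arrays; the sketch uses `n_components=2` directly for brevity.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic stand-in for the asker's data (an assumption):
# two well-separated blobs of unequal size in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(150, 5)),
    rng.normal(loc=5.0, scale=1.0, size=(50, 5)),
])

# Fit the PCA once, on the data only.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Cluster in the original feature space, as in the question.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
centroids = kmeans.cluster_centers_

# Correct: reuse the fitted PCA, so centroids and data share one projection.
centroids_2d = pca.transform(centroids)

# Problematic: refitting on the centroids builds a different projection,
# centered on the midpoint of the two centroids instead of the data mean.
refit_2d = PCA(n_components=2).fit_transform(centroids)

print("transform:    ", centroids_2d)
print("fit_transform:", refit_2d)
```

With `transform`, each projected centroid sits at the mean of its cluster's projected points, because the PCA projection (without whitening) is an affine map. The refitted version places the two centroids symmetrically around the origin regardless of where the data lies, which matches the "way off" symptom described in the question.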
