Reputation: 23
I am currently working on clustering categorical attributes that come from a bank marketing dataset from Kaggle. I have created the three clusters with kmodes:
Output: cluster_df
Now I want to visualize each row of a cluster as a projection or point so that I get some kind of image:
I am having a hard time with this. Euclidean distance isn't meaningful for categorical data, right? So is there no way to create the desired visualization?
Upvotes: 1
Views: 406
Reputation: 334
A common way to visualize clusters is PCA. You can use it to reduce the multi-dimensional data to 2 dimensions, so that you can plot the data and hopefully understand it better. To use it, see the following code:
import pandas as pd
from sklearn.decomposition import PCA

# Project the numeric feature matrix x onto its first two principal components
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(
    data=principalComponents,
    columns=['principal component 1', 'principal component 2'],
)
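where x is the numeric feature matrix for your clustered rows. PCA only works on numbers, so categorical columns have to be encoded first (for example, one-hot encoded). Now you can easily visualize your clustered data, since it is 2-dimensional.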
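To make this concrete for categorical data, here is a minimal sketch of one possible approach: one-hot encode the columns, project with PCA, and colour the points by cluster label. The function names `project_clusters` and `plot_clusters`, the `cluster` column name, and the toy data are all assumptions for illustration, not taken from the question's cluster_df.

```python
import pandas as pd
from sklearn.decomposition import PCA


def project_clusters(df, cluster_col="cluster"):
    """One-hot encode the categorical columns and project them to 2D with PCA."""
    features = df.drop(columns=[cluster_col])
    # PCA needs numbers, so each category becomes its own 0/1 indicator column.
    encoded = pd.get_dummies(features).astype(float)
    coords = PCA(n_components=2).fit_transform(encoded)
    projected = pd.DataFrame(coords, columns=["pc1", "pc2"], index=df.index)
    projected[cluster_col] = df[cluster_col]
    return projected


def plot_clusters(projected, cluster_col="cluster"):
    """Scatter plot of the 2D projection, one colour per cluster."""
    import matplotlib.pyplot as plt  # imported here so the projection works without it

    fig, ax = plt.subplots()
    for label, group in projected.groupby(cluster_col):
        ax.scatter(group["pc1"], group["pc2"], label=f"cluster {label}", alpha=0.6)
    ax.set_xlabel("pc1")
    ax.set_ylabel("pc2")
    ax.legend()
    return ax


# Toy stand-in for cluster_df; the column names and values are made up.
toy_df = pd.DataFrame({
    "job": ["admin", "technician", "admin", "services", "technician", "services"],
    "marital": ["married", "single", "married", "single", "divorced", "single"],
    "cluster": [0, 1, 0, 2, 1, 2],
})
projected = project_clusters(toy_df)
```

Calling `plot_clusters(projected)` followed by `matplotlib.pyplot.show()` should draw the scatter plot. Rows with identical categories land on exactly the same point, so with real data many points will overlap; the `alpha` transparency is one way to keep dense spots visible.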
Upvotes: 1