koocer

Reputation: 23

Perform Multidimensional Scaling (MDS) for clustered categorical data in Python

I am currently working on clustering categorical attributes that come from a bank marketing dataset from Kaggle. I have created the three clusters with kmodes:

Output: cluster_df

Now I want to visualize each row of a cluster as a projection or point so that I get some kind of image:

Desired visualization

I am having a hard time with this. A Euclidean distance doesn't make sense for categorical data, right? Is there no way to create the desired visualization, then?

Upvotes: 1

Views: 406

Answers (1)

egjlmn1

Reputation: 334

A common way to visualize clusters is PCA. PCA reduces the multi-dimensional data to 2 dimensions so that you can plot it and, hopefully, understand the data better. Note that PCA needs numeric input, so categorical columns have to be encoded first (for example, one-hot encoded). The following code shows how to use it:

import pandas as pd
from sklearn.decomposition import PCA

# x must be numeric, e.g. one-hot encoded categorical columns
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['principal component 1', 'principal component 2'])

where x is your clustered data after it has been transformed into numeric form. You can now visualize your clustered data easily, since it is 2-dimensional.
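To make the step from categorical columns to x concrete, here is a minimal sketch, not a definitive recipe. It assumes one-hot encoding via `pd.get_dummies` is an acceptable numeric representation. The small `df` and the `clusters` list are hypothetical stand-ins for the bank marketing columns and the labels returned by kmodes.

```python
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical toy data standing in for the categorical bank marketing columns.
df = pd.DataFrame({
    "job": ["admin", "technician", "admin", "services", "technician", "services"],
    "marital": ["married", "single", "married", "single", "divorced", "married"],
    "housing": ["yes", "no", "yes", "no", "no", "yes"],
})
# Hypothetical cluster labels, one per row, as a kmodes model would assign them.
clusters = [0, 1, 0, 2, 1, 2]

# One-hot encode so that every category becomes a numeric 0/1 column.
x = pd.get_dummies(df).astype(float)

# Project the encoded rows onto the first two principal components.
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)

principalDf = pd.DataFrame(
    data=principalComponents,
    columns=["principal component 1", "principal component 2"],
    index=df.index,
)
# Keep the cluster label next to each projected point for coloring.
principalDf["cluster"] = clusters
```

From here, each row is a 2-D point that can be drawn with, for example, matplotlib's `plt.scatter`, passing the two component columns as coordinates and `principalDf["cluster"]` as the `c` argument. Rows with identical categories land on the same point, so real data may show fewer distinct dots than rows.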

Upvotes: 1
