Radhakrishnan

Reputation: 266

Use k-mode output to predict

I have performed k-modes clustering on the categorical variables of a historical dataset, because I wanted to see what clusters the data falls into. Now that I have the output, is there any way to predict which cluster a new data point will fall into when it comes in?

One way might be to treat the existing rows and their assigned clusters as training data and do supervised learning. But I want to know whether there is a method that uses the existing clustering output directly to predict (a sort of semi-supervised learning).

I may not be able to share any data or output since I am working for a client, but any direction on how to approach this would be very helpful. I have been researching this for quite some time now but couldn't find a suitable solution.

Upvotes: 1

Views: 1333

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77464

Most clustering algorithms have no built-in way to assign new data to the clusters they found.

k-means and Gaussian mixture models (GMM) are exceptions, and k-modes should work like k-means: assign the new point to the most similar mode.
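To make "find the most similar mode" concrete, here is a minimal sketch in plain Python. It assumes the usual k-modes dissimilarity (the number of attributes on which two rows differ), and the `modes` and `new_row` values are made up for illustration; you would substitute the modes from your own fitted clustering.

```python
from typing import Sequence


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the attributes on which two categorical rows disagree."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def predict_cluster(row: Sequence[str], modes: Sequence[Sequence[str]]) -> int:
    """Return the index of the mode closest to `row` (ties go to the lowest index)."""
    distances = [hamming(row, mode) for mode in modes]
    return distances.index(min(distances))


# Hypothetical modes, standing in for the ones your k-modes run produced.
modes = [
    ("red", "small", "metal"),
    ("blue", "large", "wood"),
]

new_row = ("red", "large", "metal")
print(predict_cluster(new_row, modes))  # prints 0: one mismatch vs. two
```

If you happen to be using the `kmodes` Python package, its `KModes` estimator should already offer this through `predict`, with the modes stored in `cluster_centroids_`; check the documentation of your installed version before relying on that.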

But when you use clustering, you really should analyze the clusters and double-check them first, because a clustering is rarely 100% right. Usually, you'll want some clusters from run A, some from run B, and so on, whatever makes sense. Then train a classifier on the reviewed, cleaned-up clusters for prediction.
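As a rough sketch of that last step, the following trains a small categorical naive Bayes classifier on rows whose cluster labels have already been reviewed. Naive Bayes is only my choice for a dependency-free example; any classifier that handles categorical features would do. The rows, the labels "A" and "B", and the function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels):
    """Collect the counts a categorical naive Bayes classifier needs."""
    class_counts = Counter(labels)
    # value_counts[(label, column)][value] = rows of that class with that value
    value_counts = defaultdict(Counter)
    # distinct values seen per column, used for add-one smoothing
    vocab = defaultdict(set)
    for row, label in zip(rows, labels):
        for col, value in enumerate(row):
            value_counts[(label, col)][value] += 1
            vocab[col].add(value)
    return class_counts, value_counts, vocab


def predict(row, model):
    """Return the label with the highest smoothed log-probability for `row`."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for col, value in enumerate(row):
            count = value_counts[(label, col)][value]
            # add-one smoothing; the extra +1 reserves mass for unseen values
            score += math.log((count + 1) / (n + len(vocab[col]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical rows with cluster labels that were reviewed and cleaned up by hand.
rows = [
    ("red", "small", "metal"),
    ("red", "small", "wood"),
    ("red", "large", "metal"),
    ("blue", "large", "wood"),
    ("blue", "large", "metal"),
    ("blue", "small", "wood"),
]
labels = ["A", "A", "A", "B", "B", "B"]

model = fit_naive_bayes(rows, labels)
print(predict(("red", "small", "metal"), model))  # prints A
print(predict(("blue", "large", "wood"), model))  # prints B
```

The point of going through a classifier rather than the raw modes is that the labels it learns from are the ones you reviewed, so merged, split, or discarded clusters are reflected in the predictions.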

Upvotes: 2
