user904522

Reputation: 21

Predict in Clustering

In R, is there a predict function for clustering, like the ones we have for classification? And what can we conclude from the cluster plot that R produces, other than comparing two clusters?

Upvotes: 2

Views: 3930

Answers (2)

catastrophic-failure

Reputation: 3905

Many packages offer predict methods for cluster objects. One example is clue, with cl_predict.

The best practice is to apply the same rule that was used to cluster the training data. For example, in kernel k-means you compute the kernel distance between the new data point and each cluster center; the smallest distance determines the cluster assignment (see here for example). In spectral clustering you project the new point's dissimilarities onto the eigenfunctions of the training data, compute the Euclidean distance to the k-means centers in that space, and again assign the point to the nearest center (see here for example).
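For plain k-means the same idea reduces to nearest-center assignment. Here is a minimal sketch in base R, assuming squared Euclidean distance; `predict_kmeans` is a hypothetical helper name made up for illustration, not a function from stats or clue, and the data are simulated.

```r
# Assign new observations to the nearest k-means center.
# predict_kmeans is an illustrative helper, not part of any package.
predict_kmeans <- function(fit, newdata) {
  newdata <- as.matrix(newdata)
  centers <- fit$centers
  # Squared Euclidean distance from every new point to every center
  d2 <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(newdata, 2, centers[k, ])^2)
  })
  # Keep a points-by-centers matrix even when there is a single new point
  d2 <- matrix(d2, nrow = nrow(newdata))
  # Index of the smallest distance in each row
  max.col(-d2, ties.method = "first")
}

set.seed(1)
# Simulated training data: two well-separated groups of 20 points each
train <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
               matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
fit <- kmeans(train, centers = 2, nstart = 5)

new_points <- rbind(c(0.1, -0.2), c(4.8, 5.1))
new_labels <- predict_kmeans(fit, new_points)
```

The returned labels use the same numbering as `fit$cluster`. For kernel or spectral variants you would swap in the matching distance computation, as described above.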

Upvotes: 1

Has QUIT--Anony-Mousse

Reputation: 77505

Clustering is not designed with prediction in mind. It just tries to find objects that seem to be related. That is why many clustering implementations provide no "predict" function for their results.

However, in many situations, learning classifiers based on the clusters improves performance. To do this, you first train a classifier to assign a new object to the appropriate cluster, then classify the object using a second classifier trained only on examples from that cluster. When the cluster is pure (all of its members share one class label), you can even skip this second step.

The reason is the following: a single class label may cover several distinct types of objects. Training one classifier on the full data set may be hard, because it has to learn all of these groups at the same time. Splitting the class into two groups and training a separate classifier for each can make the task significantly easier.
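A rough sketch of the pure-cluster case in base R, on simulated XOR-style data where each class is made of two separate blobs. The names `blob`, `cluster_label`, and `classify` are made up for this example, and nearest-center assignment stands in for the "classifier that picks the cluster". With impure clusters you would train a real classifier per cluster instead of using the majority label.

```r
set.seed(42)
# Simulated data: class A sits at (0,0) and (5,5), class B at (0,5) and (5,0)
blob <- function(cx, cy, n = 25) cbind(rnorm(n, cx, 0.3), rnorm(n, cy, 0.3))
x <- rbind(blob(0, 0), blob(5, 5), blob(0, 5), blob(5, 0))
y <- factor(rep(c("A", "B"), each = 50))

# Step 1: split the data into clusters, ignoring the labels
fit <- kmeans(x, centers = 4, nstart = 20)

# Step 2: clusters are pure here, so the majority label per cluster is enough
cluster_label <- sapply(split(as.character(y), fit$cluster),
                        function(labels) names(which.max(table(labels))))

classify <- function(newdata) {
  newdata <- matrix(newdata, ncol = ncol(fit$centers))
  # Nearest cluster center for each new point
  nearest <- apply(newdata, 1, function(p) {
    which.min(colSums((t(fit$centers) - p)^2))
  })
  unname(cluster_label[as.character(nearest)])
}

pred <- classify(rbind(c(0.2, 0.1), c(4.9, 0.3)))
```

A single linear classifier cannot separate these two classes, while each of the four clusters is trivially easy to label on its own.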

Upvotes: 2
