wander

Reputation: 157

Cluster analysis: how to label the clusters?

I am confused about the following two problems. I have a 15-dimensional dataset, and I want to use clustering to find out how many types of attacks it contains.

1. I have already clustered my dataset into 5 clusters (5 attacks). How can I determine which cluster corresponds to which attack? (That is, how do I label the clusters with something more meaningful than "cluster 1, cluster 2, ..."?)

2. In supervised classification, we have a training dataset and a testing dataset, and testing is conducted with the classifier built from the training dataset. Can the same approach be used for clustering? That is, can I build a model with a clustering algorithm and then automatically assign a new instance to a specific cluster? Is this achievable?

Upvotes: 4

Views: 1526

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77485

How should an unsupervised method be able to identify named attacks?

The human-assigned name is not in the data!

For some clustering algorithms you can assign new instances automatically, but in general you cannot (not without knowing the model used by the clustering). In the worst case, a new observation could even merge two clusters into one. What are you going to do then?
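To illustrate the "some algorithms" case: a centroid-based model such as k-means comes with a natural assignment rule, namely "pick the nearest centroid". The sketch below assumes you already have centroids from an earlier run; the `nearest_centroid` helper and the 2-D centroid values are made up for illustration and are not from the question.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids from an earlier k-means run (2-D for readability;
# the dataset in the question would have 15 coordinates per centroid).
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

new_point = (9.0, 8.5)
cluster_id = nearest_centroid(new_point, centroids)
print(cluster_id)
```

This rule only makes sense because the model *is* the set of centroids. A density-based or hierarchical result has no such built-in rule, which is why, for example, scikit-learn's `KMeans` offers a `predict` method while its `DBSCAN` does not.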

If you want classification, use classification, not clustering.

Clustering has a very different mind-set. If you approach it from a classification point of view, you will not really understand it. You use clustering for finding something unknown in data, classification for generalizing something known to new data.

If necessary, you can also train a classifier on your clusters. But don't do this blindly. First make sure that the clusters actually are something useful. It's much easier to come up with a completely meaningless clustering result than with a good clustering. Training a classifier on worthless clusters won't produce a meaningful output.
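A rough sketch of that "check first, then classify" idea, using only the standard library. Everything here is an assumption made for illustration: the toy points, the choice of the mean silhouette coefficient as a sanity check, the arbitrary 0.5 threshold, and the 1-nearest-neighbour classifier. A good silhouette value is no substitute for inspecting whether the clusters mean anything for your attacks.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient; values near 1 suggest well-separated clusters."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from p to every other point, grouped by cluster label.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

def predict_1nn(point, points, labels):
    """Give `point` the cluster label of its nearest already-clustered point."""
    nearest = min(range(len(points)), key=lambda i: math.dist(point, points[i]))
    return labels[nearest]

# Toy data standing in for a clustering result (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

score = silhouette(points, labels)
if score > 0.5:  # arbitrary threshold, chosen for this sketch only
    print(predict_1nn((9, 9), points, labels))
else:
    print("clusters look poorly separated; do not train on them")
```

The classifier can only reproduce cluster ids. Mapping an id to a named attack still needs human knowledge that is not in the data.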

Upvotes: 5
