dery143

Reputation: 11

ELKI GUI no clustering results for Hierarchical clustering

I'm new to ELKI and need to do some basic clustering of a dataset that I have already tested and clustered in Weka. I'm using the GUI version, and I read the tutorial "Analyzing the 'mouse' data set" on the ELKI site: http://elki.dbs.ifi.lmu.de/wiki/Tutorial#Analyzingthemousedataset

I clustered my dataset with EM and successfully visualized and wrote out the results (compared to the tutorial, I only changed the parameter resultHandler to ResultWriter). The output folder contains cluster.txt, cluster-evaluation.txt and settings.txt.

I have problems with the output for hierarchical algorithms (SLINK, CLINK, etc.). The only output I get is settings.txt, but I need cluster.txt.

Do I need to change some other parameters? The log view shows no errors.

Upvotes: 1

Views: 382

Answers (1)

Erich Schubert

Reputation: 8725

To get partitions from a hierarchical clustering result, you also need to specify a cluster extraction method:

-algorithm clustering.hierarchical.extraction.HDBSCANHierarchyExtraction
-algorithm CLINK
-hdbscan.minclsize 50

Note that we have two -algorithm parameters now, and order is important. The extraction algorithm has a "nested" algorithm call to do the actual hierarchical clustering.

CLINK clustering result
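
The "nested" structure described above can be illustrated outside of ELKI. The following is a minimal Python sketch, not ELKI's implementation: the function names `complete_linkage` and `extract_partition` are made up for illustration, and the extraction step simply cuts the hierarchy at a fixed number of clusters `k`, which is a simpler rule than the HDBSCAN-style extraction used in the parameters above. It shows why a hierarchical algorithm alone yields only a merge history, and why the extraction step wraps the hierarchical run.

```python
from itertools import combinations
from math import dist


def complete_linkage(points):
    """Naive agglomerative clustering; returns the merge history (the hierarchy)."""
    clusters = {i: [i] for i in range(len(points))}
    next_id = len(points)
    merges = []

    def linkage(pair):
        # Complete linkage: cluster distance is the largest pairwise point distance.
        a, b = pair
        return max(dist(points[i], points[j])
                   for i in clusters[a] for j in clusters[b])

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def extract_partition(points, k, hierarchical=complete_linkage):
    """Run the nested hierarchical algorithm, then replay merges until k clusters remain."""
    merges = hierarchical(points)  # the nested call: no hierarchy, no extraction
    members = {i: [i] for i in range(len(points))}
    next_id = len(points)
    for a, b, _height in merges[:len(points) - k]:
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Hypothetical toy data: two tight pairs and one far-away point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 0)]
print(extract_partition(points, k=3))  # [[0, 1], [2, 3], [4]]
```

Calling `complete_linkage` on its own returns only a list of merges, which is roughly the situation in the question: there is a hierarchy, but no flat clusters to write out until an extraction step is applied.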

In the long run, we want to move to an operator-based approach (in particular for GUIs). For the command line, the nested invocation is safer, as you cannot attempt to extract clusters without first running a hierarchical clustering.

As for CLINK, the cluster quality is usually not very good. It is also order dependent, so shuffling the data and running it multiple times will give different results. I'd also give AGNES or Anderberg with complete linkage a try. AGNES is always O(n^3), while Anderberg is usually O(n^2) (only the worst case is O(n^3)), and both produce much better results. They are expected to produce the same results as each other except for tied distances; CLINK is different:

Complete-Link clustering with Anderberg algorithm
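
To make the O(n^3) cost of AGNES concrete, here is a rough Python sketch of the naive matrix-based procedure with complete linkage. It is an illustration under my own assumptions, not ELKI's code, and `agnes_complete` is a made-up name. Each of the n - 1 merges scans the remaining distance matrix for the closest pair, which is where the cubic cost comes from. Anderberg's variant avoids most of that scan by caching each cluster's nearest neighbor, which is why it is usually much faster.

```python
from math import dist, inf


def agnes_complete(points):
    """Naive AGNES with complete linkage; returns merges as (i, j, height)."""
    n = len(points)
    # Full pairwise distance matrix; row i stands for the cluster represented by i.
    d = [[dist(p, q) for q in points] for p in points]
    active = list(range(n))
    merges = []
    while len(active) > 1:
        # O(n^2) scan for the closest pair of active clusters.
        best, bi, bj = inf, -1, -1
        for x, i in enumerate(active):
            for j in active[x + 1:]:
                if d[i][j] < best:
                    best, bi, bj = d[i][j], i, j
        merges.append((bi, bj, best))
        # Complete-linkage update: the merged cluster keeps the larger distance.
        for k in active:
            if k != bi and k != bj:
                d[bi][k] = d[k][bi] = max(d[bi][k], d[bj][k])
        active.remove(bj)  # cluster bj is absorbed into bi
    return merges


# Hypothetical toy data: two tight pairs and one far-away point.
for i, j, height in agnes_complete([(0, 0), (0, 1), (5, 5), (5, 6), (20, 0)]):
    print(f"merge {i} and {j} at distance {height:.3f}")
```

The strict `<` comparison means that tied distances are resolved by scan order, which is one place where two otherwise equivalent implementations can legitimately produce different hierarchies.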

Upvotes: 1
