Reputation: 188
I am trying to run K-Means using ELKI MiniGUI. I have a CSV dataset of 15 features (columns) and a label column. I would like to do multiple runs of K-Means with different combinations of the feature columns.
Is there anywhere in the MiniGUI where I can specify the indices of the columns I would like to be used for clustering?
If not, what is the simplest way to achieve this by changing/extending ELKI in Java?
Upvotes: 0
Views: 81
Reputation: 8725
This is obviously easy to do in Java code, or simply by preprocessing the data as necessary: generate, say, 10 variants of the file, then launch ELKI from the command line on each.
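For the preprocessing route, a minimal sketch in Python could look like the following. It assumes a comma-separated file without a header row and with the label in the last column; the helper names `select_columns` and `write_variant` are made up for illustration and are not part of ELKI.

```python
import csv
import io


def select_columns(rows, feature_idx, label_idx=None):
    """Keep only the chosen feature columns, plus the label column if given."""
    keep = list(feature_idx)
    if label_idx is not None:
        keep.append(label_idx)
    return [[row[i] for i in keep] for row in rows]


def write_variant(rows, path):
    """Write one reduced variant of the data set to its own CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


# Tiny in-memory example: 3 feature columns followed by a label column.
sample = "1.0,2.0,3.0,a\n4.0,5.0,6.0,b\n"
rows = list(csv.reader(io.StringIO(sample)))

# Keep features 0 and 2 and the label in column 3.
variant = select_columns(rows, (0, 2), label_idx=3)
```

Each variant can then be written out with `write_variant(variant, "some_name.csv")` (the file name is up to you) and passed to ELKI as a separate input file.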
But there is also a filter to select columns: NumberVectorFeatureSelectionFilter. To use only columns 0, 1, 2 (indices refer to the numeric part; labels are treated separately at this point, because this is a vector transformation):
-dbc.filter transform.NumberVectorFeatureSelectionFilter
-projectionfilter.selectedattributes 0,1,2
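If you want one run per column combination, the two parameters above can be generated by a small script. This is only a sketch: the subset size of 3 is an arbitrary assumption, `filter_args` is a made-up helper name, and the remaining ELKI parameters (input file, algorithm, k and so on) still have to be added to each argument list yourself.

```python
from itertools import combinations


def filter_args(columns):
    """Build the two filter parameters shown above for one column subset."""
    return [
        "-dbc.filter", "transform.NumberVectorFeatureSelectionFilter",
        "-projectionfilter.selectedattributes", ",".join(str(c) for c in columns),
    ]


N_FEATURES = 15   # number of feature columns, from the question
SUBSET_SIZE = 3   # assumption: cluster on every 3-column subset

# One argument list per combination of feature columns (0-based indices).
all_runs = [filter_args(c) for c in combinations(range(N_FEATURES), SUBSET_SIZE)]
```

Each entry of `all_runs` is the filter part of one command line; 15 columns taken 3 at a time gives 455 runs, so you may want to restrict the combinations to the ones you actually care about.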
The filter could be extended using our newer IntRangeParameter to allow for specifications such as 1..3,5..8, but this has not been implemented yet.
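Until then, a range specification can be expanded outside ELKI into the plain comma-separated list the filter accepts. The sketch below assumes that both ends of a range such as 1..3 are inclusive; that is a guess at the intended semantics, not something ELKI defines for this filter, and `expand_ranges` is a made-up helper.

```python
def expand_ranges(spec):
    """Expand a spec like "1..3,5..8" into an explicit list of column indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if ".." in part:
            start, end = part.split("..")
            # Assumption: the upper bound is inclusive.
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


# Value to pass to -projectionfilter.selectedattributes
selected = ",".join(str(i) for i in expand_ranges("1..3,5..8"))
```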
Upvotes: 1