Josh

Reputation: 141

Using ELKI, having trouble with more than 14 dimensions

I'm trying to use SUBCLU in ELKI. To work out how to feed in high-dimensional data, I first tried DBSCAN and even KMeansLloyd. Unfortunately I can only load up to 14 dimensions; with any more, the program complains that I have not set a parameter for "bubble.scaling", even though I clearly have. I'm supplying the data as a .csv file formatted like the "mouse.csv" tutorial file (which is how I worked out how to enter data with more than one dimension in the first place). What am I doing wrong?

Upvotes: 1

Views: 178

Answers (2)

mark79

Reputation: 56

I had the same problem. In my case it turned out that my CSV file contained only integer columns, which were read as strings instead of numeric data. Setting dbc.parser to CategoricalDataAsNumberVectorParser made the out-of-bounds error disappear.

Upvotes: 0

Josh

Reputation: 141

Turns out I wasn't formatting the CSV file properly. Rather than a CSV file containing only the data, with the dimensions separated by spaces, I also needed to include the headers. As I wasn't using randomly generated data and didn't know the number of clusters beforehand, this is what the CSV looked like:

## Size: 10
########################################################
1 2 3 4 5 6 7 8 9 10 11 12 13 14
1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 13 12 11 10 9 8 7 6 5 4 3 2 1
14 13 12 11 10 9 8 7 6 5 4 3 2 1
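
For anyone wanting to generate a file in this layout, here is a minimal Python sketch, not anything ELKI ships. The helper names `format_rows` and `column_counts` are hypothetical, and writing every value as a float (`1.0` rather than `1`) is my own assumption, made to sidestep the integer-column issue mentioned in the other answer. It writes "#" header lines followed by space-separated rows, then checks that every data row has the same number of numeric columns, since a ragged or non-numeric row is an easy mistake to miss at higher dimensions.

```python
def format_rows(rows, comment_lines=()):
    """Return file text: '#' header lines first, then one space-separated row per line."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"rows have differing column counts: {sorted(widths)}")
    lines = [f"# {c}" for c in comment_lines]
    # Assumption: emit floats so no column looks like a non-numeric label.
    lines += [" ".join(str(float(v)) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


def column_counts(text):
    """Count numeric columns per data line, skipping blank lines and '#' lines."""
    counts = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        for field in fields:
            float(field)  # raises ValueError if a field is not numeric
        counts.append(len(fields))
    return counts


# Two 16-dimensional example rows (made-up values, for illustration only).
rows = [list(range(1, 17)), list(range(16, 0, -1))]
text = format_rows(rows, comment_lines=["Size: 2"])
print(text)
print(column_counts(text))
```

If `column_counts` reports different numbers for different rows, or raises a `ValueError`, the file is worth fixing before handing it to ELKI.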

Upvotes: 1
