Reputation: 141
I'm trying to use SUBCLU in ELKI, but to figure things out I first tried DBSCAN, and even KMeansLloyd, just to learn how to input high-dimensional data. Unfortunately, I can only enter up to 14 dimensions; with any more, the program complains that I haven't entered a parameter for "bubble.scaling", even though I quite clearly have. I'm supplying the data as a .csv file formatted like the "mouse.csv" tutorial file (which is how I figured out how to enter data with more than one dimension in the first place). What am I doing wrong?
Upvotes: 1
Views: 178
Reputation: 56
I had the same problem. In my case, it turned out that my CSV file contained only integer columns, which were interpreted as strings instead of numeric data. Setting dbc.parser to CategoricalDataAsNumberVectorParser made the out-of-bounds error disappear.
Upvotes: 0
Reputation: 141
Turns out I wasn't formatting the CSV file properly. Rather than a CSV file containing only the data, with the dimensions separated by spaces, I also needed to include the header lines. As I wasn't using randomly generated data and didn't know the number of clusters beforehand, this is what the CSV looked like:
## Size: 10
########################################################
1 2 3 4 5 6 7 8 9 10 11 12 13 14
1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 13 12 11 10 9 8 7 6 5 4 3 2 1
14 13 12 11 10 9 8 7 6 5 4 3 2 1
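For anyone generating such a file from a script, here is a minimal Python sketch that produces the same layout: two comment lines starting with "#", then one space-separated vector per line. The function name `format_elki_input` is hypothetical, and the sketch assumes the "Size" value should be the number of data rows, which the post above does not confirm. It also checks that every row has the same number of dimensions, since a ragged file is an easy mistake to make.

```python
def format_elki_input(rows, sep=" "):
    """Return file text: a comment header, then one vector per line.

    Hypothetical helper, not part of ELKI. Assumes "Size" means the
    number of data rows (an assumption, not confirmed by the post).
    """
    if not rows:
        raise ValueError("rows must not be empty")
    dims = len(rows[0])
    # Every vector must have the same dimensionality.
    for i, row in enumerate(rows):
        if len(row) != dims:
            raise ValueError(f"row {i} has {len(row)} values, expected {dims}")
    # Header lines start with '#'; the separator length is arbitrary.
    lines = [f"## Size: {len(rows)}", "#" * 56]
    lines += [sep.join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


# The four data rows shown in the post above.
rows = [
    list(range(1, 15)),
    list(range(1, 15)),
    list(range(14, 0, -1)),
    list(range(14, 0, -1)),
]
text = format_elki_input(rows)
```

Writing `text` to a file (for example with `open("data.csv", "w").write(text)`, where the file name is just a placeholder) should give input in the same shape as the sample above.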
Upvotes: 1