Reputation: 7288
Being relatively new to Weka, I'm wondering if it's possible to train a classifier on a CSV file containing variable-length rows of data. For example, a CSV file that looks like the following:
1, 2, 3, 4, 3, 2, 1
1, 2, 4, 3, 2, 1
...
Whilst basic, both of these lines show a clear pattern. Will a Weka classifier trained on a CSV file like this work effectively when it receives a similar pattern?
Upvotes: 0
Views: 211
Reputation: 66795
In short, no. This is a difficult case that the default WEKA models cannot handle directly. Such data requires either preprocessing into a fixed-length representation that WEKA can handle (which may contain missing values), or more complex models that can work with variable-length data. It looks like a time series, so you should look for tools and models designed for that. I would suggest looking at DTW (Dynamic Time Warping) combined with classifiers that work with a custom distance measure (for example KNN) rather than the raw data representation.
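To make the DTW suggestion concrete, here is a minimal sketch in plain Python (not WEKA) of a DTW distance and a 1-nearest-neighbour classifier built on it. The function names, the absolute-difference local cost, and the "peak"/"valley" labels are assumptions made for illustration only; they are not part of WEKA or of the question's data.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses |x - y| as the local cost (an assumption; other costs work too).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def nearest_neighbor_label(query, labeled_sequences):
    """1-NN: return the label of the training sequence closest under DTW."""
    best = min(labeled_sequences, key=lambda pair: dtw_distance(query, pair[0]))
    return best[1]


# Hypothetical training data: sequences of different lengths with made-up labels.
training = [
    ([1, 2, 3, 4, 3, 2, 1], "peak"),
    ([4, 3, 2, 1, 2, 3, 4], "valley"),
]

print(dtw_distance([1, 2, 3, 4, 3, 2, 1], [1, 2, 4, 3, 2, 1]))
print(nearest_neighbor_label([1, 2, 4, 3, 2, 1], training))
```

Because DTW aligns sequences of different lengths directly, the two rows from the question can be compared without padding or truncation.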
Upvotes: 1
Reputation: 16104
No. You need to explicitly mark which feature has a missing value. For example, if
1,2,3,4,3,2,1 is a row with all values present, then
1,,2,4,3,2,1 is another row in which the 2nd feature is missing.
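As a rough sketch of that preprocessing step, the Python below pads every row to the length of the longest one before the file is handed to WEKA. It assumes the missing values belong at the end of each short row, which may well be wrong for your data (only you know which position is really missing), and it uses "?" as the marker because that is the missing-value symbol in ARFF files. The helper names are hypothetical.

```python
def pad_rows(rows, missing="?"):
    """Pad each row on the right so that all rows have the same length.

    Assumption: the absent values are trailing ones. If a value is missing
    in the middle of a row, the marker has to be placed there instead.
    """
    width = max(len(row) for row in rows)
    return [list(row) + [missing] * (width - len(row)) for row in rows]


def to_csv_lines(rows, missing="?"):
    """Render the padded rows as comma-separated lines."""
    return [",".join(str(v) for v in row) for row in pad_rows(rows, missing)]


rows = [
    [1, 2, 3, 4, 3, 2, 1],
    [1, 2, 4, 3, 2, 1],
]

for line in to_csv_lines(rows):
    print(line)
```

Note that padding only makes the file loadable; it does not align the pattern across rows, so a classifier may still treat the shifted values as unrelated features.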
Upvotes: 1