Classifier for data with varying dimensionality

Question

I need to train a classifier with data whose dimensionality can vary. For example (this is made-up data for illustration):

class-1,0,1,2,3
class-2,0,3,2,4,5,7
class-3,1,8,8,8,2,8,0,0,0
:
:
and so on...

I am trying to train a linear SVM using scikit-learn, which requires the dimensionality to be fixed. Simply zero-padding the shorter vectors to match the length of the longest one is giving me disappointing results.

Should I be using a different classifier for such data? How should I approach this?

shirowww · Accepted Answer

Feature hashing (also known as the hashing trick) is the technique you need: it converts your variable-length input into constant-length input. You can then use the transformed vectors with any appropriate learning algorithm.

Wikipedia: Feature Hashing
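
To make the idea concrete, here is a minimal pure-Python sketch of the hashing trick applied to the rows from the question. It is only an illustration under some assumptions: each value is keyed by its position in the row (`pos0`, `pos1`, ...), the output width `N_FEATURES = 16` is an arbitrary choice, and the helper names `bucket_and_sign` and `hash_row` are made up for this example. Depending on your data, keying by value or by (position, value) pair may work better.

```python
import hashlib

N_FEATURES = 16  # assumed output width; tune this for your data


def bucket_and_sign(key, n_features):
    """Map a feature key to a (bucket index, +/-1 sign) pair.

    MD5 is used only because it is stable across runs, unlike Python's
    built-in hash() for strings.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")
    sign = 1 if h & 1 else -1          # lowest bit picks the sign
    index = (h >> 1) % n_features      # remaining bits pick the bucket
    return index, sign


def hash_row(values, n_features=N_FEATURES):
    """Fold a variable-length row into a fixed-length vector."""
    vec = [0.0] * n_features
    for position, value in enumerate(values):
        # Assumption: the feature key is the position in the row.
        index, sign = bucket_and_sign(f"pos{position}", n_features)
        vec[index] += sign * value     # colliding features are summed
    return vec


# Rows taken from the question (labels stripped).
rows = [
    [0, 1, 2, 3],
    [0, 3, 2, 4, 5, 7],
    [1, 8, 8, 8, 2, 8, 0, 0, 0],
]
X = [hash_row(r) for r in rows]  # every vector now has length N_FEATURES
```

Every row in `X` has the same length, so it can be fed to a fixed-dimensionality classifier such as scikit-learn's `LinearSVC`. scikit-learn also ships its own implementation of this technique, `sklearn.feature_extraction.FeatureHasher`, which returns sparse matrices and is probably the better choice in practice.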
