Santino

Reputation: 815

Classifier for data with varying dimensionality

I need to train a classifier with data whose dimensionality can vary. For example (this is made-up data for illustration):

class-1,0,1,2,3
class-2,0,3,2,4,5,7
class-3,1,8,8,8,2,8,0,0,0
:
:
and so on...

I am trying to train a linear SVM using scikit-learn, which requires the dimensionality to be fixed. Simply zero-padding the shorter samples to match the length of the longest one is giving me disappointing results.

Should I be using a different classifier for such data? How should I approach this?

Upvotes: 0

Views: 93

Answers (2)

shirowww

Reputation: 573

Feature hashing is the algorithm you need to convert your variable-length input into constant-length input. You can then use the transformed vectors with any appropriate learning algorithm.

Wikipedia: Feature Hashing
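
To make the idea concrete, here is a minimal pure-Python sketch of the hashing trick, not a definitive implementation. It assumes each value is keyed by its position in the row (the hypothetical feature name "pos0", "pos1", ...); the function name hash_features and the choice of 16 output dimensions are illustrative only. scikit-learn ships its own version as sklearn.feature_extraction.FeatureHasher, which would be the natural choice in practice.

```python
import hashlib

def hash_features(values, n_features=16):
    """Map a variable-length sequence to a fixed-length vector (hashing trick)."""
    vec = [0.0] * n_features
    for position, value in enumerate(values):
        # Assumption: the feature name is the position of the value in the row.
        name = f"pos{position}".encode("utf-8")
        h = int.from_bytes(hashlib.md5(name).digest()[:8], "big")
        index = h % n_features                      # bucket for this feature
        sign = 1.0 if (h >> 63) & 1 == 0 else -1.0  # signed hashing reduces collision bias
        vec[index] += sign * value
    return vec

# Made-up rows from the question, without the class labels.
rows = [
    [0, 1, 2, 3],
    [0, 3, 2, 4, 5, 7],
    [1, 8, 8, 8, 2, 8, 0, 0, 0],
]
X = [hash_features(r) for r in rows]  # every row now has length 16
```

Every row ends up with the same length regardless of how many values it started with, so X can be passed to a fixed-dimension classifier. Collisions between feature names become more likely as n_features shrinks, so the output size is a trade-off worth tuning.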

Upvotes: 1

klubow

Reputation: 126

Try padding with the feature mean or median; that's another way to deal with missing data. Were those measurements made at the same points/features?
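
A rough sketch of mean padding, assuming that position j means the same feature in every row (which is exactly what the question above is asking). The helper name pad_with_column_mean is made up for illustration; swapping statistics.mean for statistics.median gives the median variant.

```python
from statistics import mean

def pad_with_column_mean(rows):
    """Pad shorter rows with the per-column mean of the rows that have that column."""
    width = max(len(r) for r in rows)
    # Mean of each column, computed only over rows long enough to contain it.
    col_means = [mean(r[j] for r in rows if len(r) > j) for j in range(width)]
    return [list(r) + col_means[len(r):] for r in rows]

# Made-up rows from the question, without the class labels.
rows = [
    [0, 1, 2, 3],
    [0, 3, 2, 4, 5, 7],
    [1, 8, 8, 8, 2, 8, 0, 0, 0],
]
padded = pad_with_column_mean(rows)
# padded[0] is [0, 1, 2, 3, 3.5, 7.5, 0, 0, 0]
```

With real data the column means would normally be computed on the training set only and reused for the test set.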

Upvotes: 1
