London guy
London guy

Reputation: 28022

NULL values across a dimension in Support Vector Machine

I am designing a support vector machine considering n dimensions. Along every dimension, the values could range from [0-1]. Now, if I am unable to determine the value across a particular dimension from the original data set, for a particular data point due to various reasons, what should the value along that dimension be for the SVM? Can I just put it as [-1] indicating a missing value?

Thanks Abhishek S

Upvotes: 0

Views: 787

Answers (1)

Godeke
Godeke

Reputation: 16281

You would be better served leaving the missing value out altogether if the dimension won't be able to contribute to your machine's partitioning of the space. This is because the only thing the SVM can do is place zero weight on that dimension as far as classification power, as all of the points in that dimension are at the same place.

Thus each pass over that dimension is just wasted computational resources. If recovering this value is of importance, you may be able to use a regression model of some type to try to get estimated values back, but if that estimated value is generated from your other data, yet again it won't actually contribute to your SVM because the data in that estimated dimension is nothing more that a summary of the data you used to generate it (which I would assume would be in your SVM model already).

Upvotes: 0

Related Questions