Reputation: 5028
How do we know which space to transform the data into? What method is used to find an appropriate mapping function?
Suppose we have 10 training cases in 2-dimensional space that aren't linearly separable, but become linearly separable if we transform them with F: (x, y) -> (x, y^2) or G: (x, y) -> (x, e^y).
How do we determine that the functions F and G will work in the first place? Through observation? And then how do we decide which function to use?
Upvotes: 0
Views: 61
Reputation: 3823
"How do we determine that the functions F and G will work in the first place? Through observation?"
Pretty much. As far as I can tell, there are currently no known conditions that would let you guarantee linear separability after the mapping.
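To make "through observation" concrete, here is a minimal sketch of checking a candidate mapping empirically. It assumes a made-up set of 10 points (not the asker's data) and uses a plain perceptron as the separability check; the names `is_linearly_separable` and `candidate_maps` are hypothetical. A perceptron reaches zero mistakes only if a separating line exists, so a `False` result just means none was found within the epoch budget.

```python
import math


def is_linearly_separable(points, labels, max_epochs=1000):
    """Run a perceptron; return True if it reaches an epoch with zero mistakes."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for p, t in zip(points, labels):
            score = sum(wi * pi for wi, pi in zip(w, p)) + b
            if t * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + t * pi for wi, pi in zip(w, p)]
                b += t
                mistakes += 1
        if mistakes == 0:
            return True
    # No separator found within the budget (not a proof that none exists).
    return False


# Hypothetical data: class +1 has large |y|, class -1 has small |y|.
points = [(0, 2), (1, -2), (-1, 1.8), (2, -1.6), (0.5, 2.2),
          (0, 0.1), (1, -0.3), (-1, 0.4), (2, 0.0), (0.5, -0.2)]
labels = [1] * 5 + [-1] * 5

candidate_maps = {
    "identity": lambda x, y: (x, y),
    "F": lambda x, y: (x, y ** 2),
    "G": lambda x, y: (x, math.exp(y)),
}

results = {
    name: is_linearly_separable([f(x, y) for x, y in points], labels)
    for name, f in candidate_maps.items()
}
print(results)
```

On this particular toy data only F yields a separable set, because the classes differ in |y|; on other data G, or neither map, might be the one that works, which is why the choice ends up being empirical.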
"And then how do we decide which function to use?"
Some functions are easier to tune than others. For instance, RBFs are very flexible and are known to fit most data, but everything has a catch: if your data is unbounded, you will lose generality. So it is a matter of tradeoffs.
Upvotes: 1
Reputation: 2960
The SVM does the mapping into a different space by way of a kernel function. So to train the SVM, you don't need to map the data into a linearly separable space yourself; simply train an SVM with an appropriate (nonlinear) kernel. Given the right kernel, it can learn decision boundaries for data that is not linearly separable. Try the RBF kernel as well as the polynomial kernel, and also experiment with the C hyperparameter.
This may not be the answer you are looking for, but a lot of machine learning is "try it and see". In this case, the polynomial kernel seems appropriate given what you state, but RBF tends to work well.
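The "try it and see" approach can be sketched with scikit-learn as below. This is only an illustration under assumptions: the concentric-circles dataset from `make_circles` stands in for the asker's data, and the kernel list, `C` grid, and polynomial degree of 2 are arbitrary choices, not recommendations.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data (an assumption): two concentric rings, not linearly separable.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)

# Cross-validate every kernel / C combination instead of guessing a mapping.
param_grid = {"kernel": ["linear", "poly", "rbf"], "C": [0.1, 1, 10]}
search = GridSearchCV(SVC(degree=2), param_grid, cv=5)
search.fit(X, y)

# Best mean cross-validated accuracy reached by each kernel.
scores = {}
cv = search.cv_results_
for params, mean_score in zip(cv["params"], cv["mean_test_score"]):
    kernel = params["kernel"]
    scores[kernel] = max(scores.get(kernel, 0.0), float(mean_score))

print(search.best_params_)
print(scores)
```

Comparing the per-kernel scores shows whether a nonlinear kernel is actually buying anything over the linear one on the data at hand.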
Upvotes: 1