Reputation: 15
I have installed Spark on AWS Elastic MapReduce (EMR) and have been running SVM using the packages in MLlib. But there are no options for choosing model-building parameters such as kernel selection and cost of misclassification (as in the e1071 package for R). Can someone please tell me how to set these parameters while building the model?
Upvotes: 1
Views: 571
Reputation: 156
MLlib's SVM implementation is limited to linear kernels, so you won't find any kernel-related options. There is some related work in progress, though, for example Pegasos.
Upvotes: 0
Reputation: 63082
Summary / TL;DR:
The hardcoded methods for SVMWithSGD are:
private val gradient = new HingeGradient()
private val updater = new SquaredL2Updater()
Since these are hard-coded, you cannot configure them the way you are used to in R.
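To make the two hard-coded pieces above more concrete, here is a minimal pure-Scala sketch of the general technique they name: a hinge-loss subgradient and an SGD step with a squared-L2 penalty. This is not MLlib's source code. The object name `HingeSgdSketch`, the function names, and the `stepSize / sqrt(iter)` decay are assumptions made for illustration.

```scala
object HingeSgdSketch {
  // Dot product of two equal-length vectors.
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Hinge-loss subgradient and loss for a single example.
  // label01 is assumed to be 0.0 or 1.0 and is rescaled to -1 / +1.
  def hingeGradient(x: Array[Double], label01: Double, w: Array[Double]): (Array[Double], Double) = {
    val y = 2.0 * label01 - 1.0
    val margin = 1.0 - y * dot(x, w)
    if (margin > 0) (x.map(xi => -y * xi), margin) // example violates the margin
    else (Array.fill(x.length)(0.0), 0.0)          // example is safely classified
  }

  // One SGD step with a squared-L2 penalty: shrink the weights, then move
  // against the gradient. The step-size decay is an assumption of this sketch.
  def squaredL2Step(w: Array[Double], grad: Array[Double],
                    stepSize: Double, iter: Int, regParam: Double): Array[Double] = {
    val eta = stepSize / math.sqrt(iter.toDouble)
    w.zip(grad).map { case (wi, gi) => wi * (1.0 - eta * regParam) - eta * gi }
  }
}
```

Note that nothing here involves a kernel function: the model is a plain linear function of the features, which is why there is no kernel option to set.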
Details:
At the "bare metal" level the mllib SVMWithSGD supports the following parameters:
There are other convenience methods that allow you to define:
You will notice that the two items you mention, kernel selection and cost of misclassification, are not included in those configurable parameters.
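For reference, here is a hedged sketch of setting the numeric parameters that are exposed, assuming the RDD-based `spark.mllib` API is on the classpath. `SvmTuningSketch` and `training` are hypothetical names, and the values are placeholders rather than recommendations. The regularization parameter is probably the closest thing to R's cost setting, acting roughly as its inverse: a larger `regParam` means stronger regularization.

```scala
import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

object SvmTuningSketch {
  // `training` is a hypothetical, already-loaded RDD with labels 0.0 or 1.0.

  // Static helper: numIterations, stepSize, regParam, miniBatchFraction.
  def trainWithHelper(training: RDD[LabeledPoint]): SVMModel =
    SVMWithSGD.train(training, 100, 1.0, 0.01, 1.0)

  // Same knobs, set through the optimizer exposed by the algorithm object.
  def trainWithOptimizer(training: RDD[LabeledPoint]): SVMModel = {
    val svm = new SVMWithSGD()
    svm.optimizer
      .setNumIterations(200)
      .setStepSize(1.0)
      .setRegParam(0.1)
      .setMiniBatchFraction(1.0)
    svm.run(training)
  }
}
```

Neither variant offers a kernel argument, which is consistent with the linear-only limitation described in this thread.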
Under the covers, these are defined by the invocation of the GradientDescent class as follows:
* @param gradient Gradient function to be used.
* @param updater Updater to be used to update weights after every iteration.
GradientDescent(gradient: Gradient, private var updater: Updater)
with the settings shown in the summary: HingeGradient as the gradient and SquaredL2Updater as the updater.
Upvotes: 1