edmond

Reputation: 245

R caret (svmRadial) keep sigma constant and use grid search for C

I am fitting a Support Vector Machine with a Radial Basis Function kernel ('svmRadial') with caret. As far as I understand the documentation and the source code, caret uses an analytical formula to get a reasonable estimate of sigma and fixes it at that value (according to the output: Tuning parameter 'sigma' was held constant at a value of 0.1028894). In addition, caret cross-validates over a set of cost parameters C (by default, 3 values).

However, if I now want to set my own grid of cost parameters (tuneGrid), I have to additionally specify a value of sigma. Otherwise the following error appears:

Error: The tuning parameter grid should have columns sigma, C

How can I fix sigma at the value given by the analytical formula and still supply my own grid of cost parameters C?

Here is a MWE:

library(caret)
library(mlbench)

data(BostonHousing)

set.seed(1)
index <- sample(nrow(BostonHousing),nrow(BostonHousing)*0.75)
Boston.train <- BostonHousing[index,]
Boston.test <- BostonHousing[-index,]

# without tuneGrid
set.seed(1)
svmR <- train(medv ~ .,
              data = Boston.train,
              method = "svmRadial",
              preProcess = c("center", "scale"),
              trControl = trainControl(method = "cv", number = 5))

# with tuneGrid (gives the error message)
set.seed(1)
svmR <- train(medv ~ .,
              data = Boston.train,
              method = "svmRadial",
              preProcess = c("center", "scale"),
              tuneGrid = expand.grid(C = c(0.01, 0.1)),
              trControl = trainControl(method = "cv", number = 5))

Upvotes: 3

Views: 5324

Answers (1)

StupidWolf

Reputation: 46968

If you look at the model information, it shows how the grid is generated when you don't provide one:

getModelInfo("svmRadial")$svmRadial$grid

function(x, y, len = NULL, search = "grid") {
                    sigmas <- kernlab::sigest(as.matrix(x), na.action = na.omit, scaled = TRUE)
                    if(search == "grid") {
                      out <- expand.grid(sigma = mean(as.vector(sigmas[-2])),
                                         C = 2 ^((1:len) - 3))
                    } else {
                      rng <- extendrange(log(sigmas), f = .75)
                      out <- data.frame(sigma = exp(runif(len, min = rng[1], max = rng[2])),
                                        C = 2^runif(len, min = -5, max = 10))
                    }
                    out
                  }
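
As a small illustration of the "grid" branch above, here is the C part of that formula evaluated on its own. The variable names `len` and `default_C` are only for this sketch; `len = 3` is assumed because the output in the question shows three C values.

```r
# C values produced by the "grid" branch of the function above,
# assuming len = 3 (three candidate values, as in the question's output)
len <- 3
default_C <- 2 ^ ((1:len) - 3)
default_C
```

These are the three C values that appear alongside the fixed sigma in the results table further down.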

So sigma is estimated with kernlab::sigest. To reproduce it, first pull out the model information for svmRadial, which contains the grid function:

models <- getModelInfo("svmRadial", regex = FALSE)[[1]]

Set up the input x and y since you are providing a formula:

preProcValues = preProcess(Boston.train, method = c("center", "scale")) 
processData = predict(preProcValues,Boston.train)
x = model.matrix(medv ~ .,data=processData)[,-1]
y = processData$medv

Then we call the grid function for this model, which gives the same sigma and C values as your output:

set.seed(1)
models$grid(x,y,3)

      sigma    C
1 0.1028894 0.25
2 0.1028894 0.50
3 0.1028894 1.00

svmR$results
      sigma    C     RMSE  Rsquared      MAE    RMSESD RsquaredSD     MAESD
1 0.1028894 0.25 5.112750 0.7591398 2.982241 0.8569208 0.05387213 0.4032354
2 0.1028894 0.50 4.498887 0.8046234 2.594059 0.7823051 0.05357678 0.3644430
3 0.1028894 1.00 4.055564 0.8349416 2.402248 0.8403222 0.06825159 0.3732571

And this is what happens underneath:

set.seed(1)
sigmas = kernlab::sigest(as.matrix(x), na.action = na.omit, scaled = TRUE)
# sigest returns three values (0.1 quantile, median, 0.9 quantile);
# the grid code drops the median and takes the mean of the two outer ones

mean(sigmas[-2])
[1] 0.1028894
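
To get back to the original question, one way to combine this with your own C values is to compute sigma as above and pass it to `tuneGrid` together with C. This is only a sketch under the same setup as the question; the names `sigma_fixed`, `custom_grid` and `svmR_custom` are made up for the example, and `floor()` is added just to make the sample size an explicit integer.

```r
library(caret)
library(kernlab)
library(mlbench)

data(BostonHousing)

# Same train split as in the question
set.seed(1)
index <- sample(nrow(BostonHousing), floor(nrow(BostonHousing) * 0.75))
Boston.train <- BostonHousing[index, ]

# Build the predictor matrix the same way as earlier in this answer
preProcValues <- preProcess(Boston.train, method = c("center", "scale"))
processData <- predict(preProcValues, Boston.train)
x <- model.matrix(medv ~ ., data = processData)[, -1]

# sigest samples from the data, so fix the seed before calling it
set.seed(1)
sigmas <- kernlab::sigest(as.matrix(x), na.action = na.omit, scaled = TRUE)
sigma_fixed <- mean(as.vector(sigmas[-2]))

# One fixed sigma crossed with a custom set of C values
custom_grid <- expand.grid(sigma = sigma_fixed, C = c(0.01, 0.1))

set.seed(1)
svmR_custom <- train(medv ~ .,
                     data = Boston.train,
                     method = "svmRadial",
                     preProcess = c("center", "scale"),
                     tuneGrid = custom_grid,
                     trControl = trainControl(method = "cv", number = 5))

svmR_custom$results
```

If you have already fitted the model once without `tuneGrid`, a shorter route should be to reuse the value caret stored, e.g. `expand.grid(sigma = svmR$bestTune$sigma, C = c(0.01, 0.1))`.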

Upvotes: 3
