Issues plotting a fitted SVM model's decision boundary using ggplot2's stat_contour()

Question

I'm trying to figure out how to plot a decision boundary for a fitted svm model in ggplot2. Right now, I'm attempting to do so by using stat_contour. Here is my code with an example call to my function at the end. You can find the data files I'm using on my github page:

train.txt

test.txt

train <- read.table('train.txt', col.names = c('digit', 'intensity', 'symmetry'))
test <- read.table('test.txt', col.names = c('digit', 'intensity', 'symmetry'))    

        digits.SVM <- function(train, test, digits = c(1, 5), C = 0.01, kernel = 'radial', degree = 3, gamma = 1, coef0 = 0, scale = FALSE, type = 'C-classification', plotApproximation = FALSE) {
              library(e1071)
              library(ggplot2)
              library(reshape2)
              if(length(digits) != 1 && length(digits) != 2)
                stop('Invalid length of digits vector.  Must specify one or two digits to classify')
              if(length(digits) == 2) {
                train <- train[(train$digit == digits[1]) | (train$digit == digits[2]), ]
                test <- test[(test$digit == digits[1]) | (test$digit == digits[2]), ]
              }

              train$class <- -1
              test$class <- -1
              train[train$digit == digits[1], ]$class <- 1
              test[test$digit == digits[1], ]$class <- 1
              fit <- svm(class~intensity + symmetry, data = train, cost = C, kernel = kernel, degree = degree, gamma = gamma, coef0 = coef0, scale = scale, type = type)
              class_fitted <- predict(fit, train[c('intensity', 'symmetry')])

              gridRange <- apply(train[c('intensity', 'symmetry')], 2, range)
              x1 <- seq(from = gridRange[1, 1] - 0.025, to = gridRange[2, 1] + 0.025, length = 75)
              x2 <- seq(from = gridRange[1, 2] - 0.05, to = gridRange[2, 2] + 0.05, length = 75)
              grid <- expand.grid(intensity = x1, symmetry = x2)
              grid$class <- predict(fit, grid)
              decisionValues <- predict(fit, grid, decision.values = TRUE)
              grid$z <- as.vector(attributes(decisionValues)$decision)
              print(colnames(grid))
              print(head(grid))
              p <- ggplot(data = grid, aes(intensity, symmetry, colour = as.factor(class))) + 
                geom_point(size = 1.5) +
                scale_fill_manual(values = c('red', 'black')) + 
                stat_contour(data = grid, aes(x = intensity, y = symmetry, z = z), breaks = c(0)) + 
                geom_point(data = train, aes(intensity, symmetry, colour = as.factor(class)), alpha = 0.7) +
                scale_colour_manual(values = c('red', 'black')) + labs(colour = 'Class') +
                scale_x_continuous(expand = c(0,0)) +
                scale_y_continuous(expand = c(0,0))
              print(p)
              mean(train$class != class_fitted)
            }

    digits.SVM(train, test, digits = c(0), kernel = 'polynomial', degree = 2, coef0 = 1)

My problem occurs when setting the "breaks" option in stat_contour(). Most of the values I set break to don't cause any issues; here is the plot that results when breaks = -1.

breaks = -1

However, the correct boundary corresponds to the contour that would result from setting breaks = 0, and as I set breaks closer to 0 the ggplot begins to have issues plotting the contour. It begins to cut off and at a value of exactly 0, it simply doesn't plot a contour at all.

Here is an example of the plot with breaks = -0.05: breaks = 0.05

As you can see, the boundary is beginning to cut off. Now here is the plot using breaks = 0:

The entire contour has been cut out.

I am also getting this error message:

Warning messages: 1: Not possible to generate contour data

I'm relatively new to ggplot2 and am not sure what stat_contour() does in the background. I tried to look for its implementation but had no luck. Any help and clarifications will be greatly appreciated!

I would also welcome any suggestions for better ways to accomplish this.

Issues plotting a fitted SVM model's decision boundary using ggplot2's stat_contour()

Answers (1)

Related Questions

Issues plotting a fitted SVM model&#39;s decision boundary using ggplot2&#39;s stat_contour()

Answers (1)

Related Questions

Issues plotting a fitted SVM model's decision boundary using ggplot2's stat_contour()