Reputation: 31
I'm trying to train an SVM classifier and use it for prediction. When I call predict on the trained model, I get this error: test data does not match model. I am not sure why this is happening. This is my code:
# to prepare the training and testing data
dat = data.frame(x = rbind(tmp1, tmp2), y = as.factor(c(rep(1, 300), rep(-1, 300))))
set.seed(1)
train_ind = sample(seq_len(nrow(dat)), size = 500)
train = dat[train_ind, ]
test = dat[-train_ind, ]
# training and prediction
library('e1071')
svmfit = svm(y ~ ., data = train, kernel ='linear', cost = 10, scale = FALSE)
ypred = predict(svmfit, test)
table(predict=ypred, truth = test$y)
Upvotes: 1
Views: 2070
Reputation: 304
If you have a categorical predictor (independent variable) in your training dataset, the test dataset may contain only the categories that are present in the training dataset. If that is your situation, check that every category in your test dataset also appears in the training dataset. Sometimes an integer variable with a short range, such as months coded as the numbers 1 to 12, ends up being treated as categorical.
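The check described above can be sketched in base R. This is only an illustration under stated assumptions: the `train` and `test` data frames and the `month` column are hypothetical toy data, not the asker's actual variables.

```r
# Hypothetical toy data standing in for the real training and test sets.
train <- data.frame(month = factor(c("1", "2", "3")), x = c(0.1, 0.2, 0.3))
test  <- data.frame(month = factor(c("2", "4")), x = c(0.5, 0.6))

# Find the factor (categorical) columns in the training data.
factor_cols <- names(train)[sapply(train, is.factor)]

# For each factor column, list values seen in test but never in training.
unseen <- lapply(factor_cols, function(col) {
  setdiff(unique(as.character(test[[col]])),
          unique(as.character(train[[col]])))
})
names(unseen) <- factor_cols

print(unseen)
```

Any column with a non-empty entry in `unseen` has test categories the model never saw during training.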
Upvotes: 0
Reputation: 31
The reason for this error is that I included the ids of the observations in the training and testing data, which confused the SVM classifier. The ids are in the first column. Once I removed the first column from both the training and testing data, it worked.
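A minimal sketch of that fix, assuming the identifier column is named `id` (a hypothetical name; in the original data it is simply the first column) and using made-up toy data in place of tmp1 and tmp2. The SVM step runs only if e1071 is installed.

```r
# Hypothetical toy data with an identifier in the first column.
set.seed(1)
dat <- data.frame(id  = 1:20,
                  x.1 = rnorm(20),
                  x.2 = rnorm(20),
                  y   = as.factor(rep(c(1, -1), each = 10)))

# Drop the identifier before splitting so it is never used as a predictor.
dat_clean <- dat[, setdiff(names(dat), "id")]

train_ind <- sample(seq_len(nrow(dat_clean)), size = 15)
train <- dat_clean[train_ind, ]
test  <- dat_clean[-train_ind, ]

# Fit and predict only when the e1071 package is available.
if (requireNamespace("e1071", quietly = TRUE)) {
  svmfit <- e1071::svm(y ~ ., data = train, kernel = "linear",
                       cost = 10, scale = FALSE)
  ypred <- predict(svmfit, test)
  print(table(predict = ypred, truth = test$y))
}
```

Dropping the column before the split keeps the training and test sets structurally identical, so both contain exactly the same predictors.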
Upvotes: 1