Reputation: 63
I'm using an SVM for classification. I have divided my data set into two CSV files: one is the training set (70% of the data) and the other is the testing set (30% of the data). When I use predict on the training set I get an answer, but on the testing set it shows an error. I'm using the e1071 package.
The program is as follows:
library(e1071)
Train <- read.csv("Train.csv")
Test <- read.csv("Test.csv")
x_Train <- subset(Train,select=-Class)
y_Train <- Train$Class
model <- svm(Class ~., data=Train)
pred <- predict(model, x_Train) # working well
table(pred,y_Train)
x_Test <- subset(Test,select=-Class)
y_Test <- Test$Class
pred <- predict(model, x_Test) # getting the error below
Error in scale.default(newdata[, object$scaled, drop = FALSE], center = object$x.scale$"scaled:center", :
length of 'center' must equal the number of columns of 'x'
Could you please help me figure out what the problem could be?
Upvotes: 3
Views: 2270
Reputation: 559
OK, for those of you who, like me, had this error and found that none of these solutions worked: what I did was increase the size of the test data marginally, and it worked like a charm. The first time I got the error I had split the two sets 80-20; I then tried 75-25 and it worked just fine. I can't be sure why, but it worked.
Upvotes: 2
Reputation: 337
This is because the fitted model stores scaling parameters for its variables, and those scaled variables don't match the variables in "newdata".
Assume that you trained the SVM model on 5 variables named PC2 to PC6:
svm_model$x.scale
$`scaled:center`
PC2 PC3 PC4 PC5 PC6
5.445380e-16 2.507442e-16 -7.655441e-16 -5.730488e-16 -3.283584e-16
$`scaled:scale`
PC2 PC3 PC4 PC5 PC6
17.774403 13.571134 7.911114 6.541206 3.608903
If your newdata contains more than 5 variables, you'll get this error. In your case, x_Test <- subset(Test,select=-Class) most likely changes the number of variables to scale.
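One way to check for this kind of mismatch is to compare the predictor names in the two data frames before calling predict. The sketch below uses only base R; the helper name compare_columns and the toy data are made up for illustration and are not part of e1071.

```r
# Hypothetical helper: report predictors that differ between training and test data.
compare_columns <- function(train, test, response = "Class") {
  train_vars <- setdiff(names(train), response)
  test_vars  <- setdiff(names(test), response)
  list(
    missing_in_test = setdiff(train_vars, test_vars),  # used in training, absent in test
    extra_in_test   = setdiff(test_vars, train_vars)   # present in test only
  )
}

# Toy data standing in for Train.csv and Test.csv (invented values)
Train <- data.frame(PC2 = c(1, 2, 3), PC3 = c(4, 5, 6), Class = c("a", "b", "a"))
Test  <- data.frame(PC2 = c(7, 8), PC3 = c(9, 10), PC4 = c(0, 1), Class = c("b", "a"))

diff_cols <- compare_columns(Train, Test)
print(diff_cols)
```

With a fitted model, the names shown under `scaled:center` in the x.scale output above could be compared against the columns of x_Test in the same way. Note that those names may not map one-to-one onto data frame columns when factor predictors are involved.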
Upvotes: 1
Reputation: 2927
If the class of a predictor in the train set is not the same as the class of that same variable in the test set, then you will run into this issue.
For example, if you trained a model with predictor variable x with class(x) = numeric, and in the test set class(x) = character, then you should convert x to numeric before predicting:
data$x <- as.numeric(data$x)
That being said, it could be any class, not strictly character or numeric; it could also be a factor variable.
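To find which variables differ, the classes of the shared columns can be compared directly. This is a minimal base R sketch; class_mismatches is a hypothetical helper and the data are invented for the example.

```r
# Hypothetical helper: list shared columns whose (first) class differs.
class_mismatches <- function(train, test) {
  shared <- intersect(names(train), names(test))
  train_cls <- vapply(train[shared], function(col) class(col)[1], character(1))
  test_cls  <- vapply(test[shared],  function(col) class(col)[1], character(1))
  differs <- train_cls != test_cls
  data.frame(variable = shared[differs],
             train = unname(train_cls[differs]),
             test = unname(test_cls[differs]),
             stringsAsFactors = FALSE)
}

# Toy data: x is numeric in the training set but was read as character in the test set
Train <- data.frame(x = c(1.5, 2.5, 3.5), Class = c("a", "b", "a"), stringsAsFactors = FALSE)
Test  <- data.frame(x = c("4.5", "5.5"), Class = c("b", "a"), stringsAsFactors = FALSE)

mismatch <- class_mismatches(Train, Test)
print(mismatch)

# Convert the offending column before predicting
Test$x <- as.numeric(Test$x)
remaining <- class_mismatches(Train, Test)
```

For factor variables, as.numeric returns the internal level codes rather than the printed values, so as.numeric(as.character(f)) is usually what is wanted there.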
Upvotes: 1
Reputation: 2077
Remove the missing data from your test data, or add na.action = na.omit when fitting your prediction model. Alternatively, you can use na.action = na.exclude:
model <- svm(Class ~., data=Train, na.action = na.exclude)
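Before predicting, it may help to see whether the test data contains missing values at all. The following is a small base R sketch with an invented test set; the column names are placeholders.

```r
# Toy test set with missing values (invented for illustration)
Test <- data.frame(PC2 = c(1, NA, 3, 4),
                   PC3 = c(5, 6, NA, 8),
                   Class = c("a", "b", "a", "b"))

# Count missing values in each column
na_per_column <- colSums(is.na(Test))
print(na_per_column)

# Keep only the rows that have no missing values
Test_complete <- Test[complete.cases(Test), ]
print(nrow(Test_complete))
```

If rows are dropped this way, y_Test should be taken from the same filtered data frame so that the predictions and the true labels stay aligned in table().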
Upvotes: 1