Sumit Waghmare

Reputation: 63

SVM is working on the training set but not on the testing set in R

I'm using an SVM for classification. I have divided my data set into two CSV files: a training set (70% of the data) and a testing set (30% of the data). When I use predict on the training set I get an answer, but on the testing set it throws an error. I'm using the e1071 package.

The program is as follows:

library(e1071)  # provides svm()

Train <- read.csv("Train.csv")
Test <- read.csv("Test.csv")

x_Train <- subset(Train, select = -Class)
y_Train <- Train$Class

model <- svm(Class ~ ., data = Train)

pred <- predict(model, x_Train)  # working well
table(pred, y_Train)

x_Test <- subset(Test, select = -Class)
y_Test <- Test$Class

pred <- predict(model, x_Test)  # getting error

Error in scale.default(newdata[, object$scaled, drop = FALSE], center =         object$x.scale$"scaled:center",  : 
length of 'center' must equal the number of columns of 'x'

Could you please help me figure out what the problem could be?

Upvotes: 3

Views: 2270

Answers (4)

rookieJoe

Reputation: 559

OK, for those of you who hit this error and, like me, found that none of these solutions worked: I increased the size of the test set slightly and it worked like a charm. The first time I got the error I had split the data 80-20; when I split it 75-25 instead, it worked just fine. I can't be sure why, but it worked.

Upvotes: 2

Yash

Reputation: 337

This happens because the fitted model stores scaling parameters (center and scale) for the training variables, and those do not match the variables in "newdata".

Assume that you trained the SVM model on 5 variables, PC2 through PC6:

svm_model$x.scale
$`scaled:center`
          PC2           PC3           PC4           PC5           PC6           
 5.445380e-16  2.507442e-16 -7.655441e-16 -5.730488e-16 -3.283584e-16 

$`scaled:scale`
      PC2       PC3       PC4       PC5       PC6       
17.774403 13.571134  7.911114  6.541206  3.608903  

If your newdata has more than 5 variables, you'll get this error. In your case, x_Test <- subset(Test, select = -Class) most likely leaves a different number of variables to scale than the model was trained on.
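
One way to check for such a mismatch is to compare the column names of the two sets. This is only a minimal sketch in base R: train_df and test_df are made-up stand-ins for Train and Test, and the stray column X is an assumption used for illustration.

```r
# Hypothetical stand-ins for Train and Test; "X" mimics a stray extra column.
train_df <- data.frame(PC2 = c(1, 2, 3), PC3 = c(4, 5, 6),
                       Class = c("a", "b", "a"))
test_df  <- data.frame(X = 1:2, PC2 = c(1, 2), PC3 = c(4, 5),
                       Class = c("a", "b"))

# Columns that appear in one set but not in the other
extra_in_test   <- setdiff(names(test_df), names(train_df))
missing_in_test <- setdiff(names(train_df), names(test_df))
print(extra_in_test)
print(missing_in_test)

# Keep only the training predictors, in the training order
predictors <- setdiff(names(train_df), "Class")
x_test <- test_df[, predictors, drop = FALSE]
```

If either setdiff() result is non-empty, the test set does not line up with what the model was trained on.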

Upvotes: 1

spies006

Reputation: 2927

If the class of a predictor in the training set is not the same as the class of that same variable in the test set, you will run into this issue.

For example, if you trained a model with a predictor variable x where class(x) is numeric, and in the test set class(x) is character, then you should convert x to numeric before predicting:

data$x <- as.numeric(data$x)

That said, the mismatch is not limited to character versus numeric; it could involve any class, for example a factor variable.
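
To find which columns differ, the classes can be compared column by column. The following is a rough sketch in base R with made-up data; train_df and test_df are hypothetical stand-ins for the real training and test sets.

```r
# Hypothetical data: x is numeric in training but was read as character in test.
train_df <- data.frame(x = c(1.5, 2.0, 3.2), Class = c("a", "b", "a"),
                       stringsAsFactors = FALSE)
test_df  <- data.frame(x = c("1.1", "2.7"), Class = c("a", "b"),
                       stringsAsFactors = FALSE)

# First class of every column in each set
train_classes <- sapply(train_df, function(col) class(col)[1])
test_classes  <- sapply(test_df,  function(col) class(col)[1])

# Shared columns whose classes disagree
shared <- intersect(names(train_classes), names(test_classes))
mismatched <- shared[train_classes[shared] != test_classes[shared]]
print(mismatched)

# Convert the offending column to match the training set
test_df$x <- as.numeric(test_df$x)
```

Note that calling as.numeric() on a factor returns the internal level codes, so a factor column would need as.numeric(as.character(...)) instead.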

Upvotes: 1

MFR

Reputation: 2077

Remove the missing data from your test data, or add na.action = na.omit to your prediction model. Alternatively, you can use na.action = na.exclude:

model <- svm(Class ~., data=Train, na.action = na.exclude)
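
To see whether the test data contains missing values in the first place, something like the following sketch may help. It uses base R only, and test_df is a made-up stand-in for Test.

```r
# Hypothetical test set with one missing value in x1
test_df <- data.frame(x1 = c(1, NA, 3), x2 = c(4, 5, 6),
                      Class = c("a", "b", "a"))

# Number of missing values per column
na_counts <- colSums(is.na(test_df))
print(na_counts)

# Keep only the rows without any missing value
test_complete <- test_df[complete.cases(test_df), ]
```

Dropping rows this way also changes the number of predictions, so the labels used for comparison should be taken from test_complete rather than from the original test set.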

Upvotes: 1
