Reputation: 453
I am trying to make predictions with a trained SVM from the e1071 package, but my data contains some missing values (NA).
I would like the returned prediction to be NA for any instance that has missing values. I tried na.action = na.pass as shown below, but it fails with "Error in names(ret2) <- rowns : 'names' attribute [150] must be the same length as the vector [149]".
If I use na.omit, I get predictions only for the instances without missing data. How can I get predictions that include the NAs?
library(e1071)
model <- svm(Species ~ ., data = iris)
print(length(predict(model, iris)))
tmp <- iris
tmp[1, "Sepal.Length"] <- NA
print(length(predict(model, tmp, na.action = na.pass)))
Upvotes: 6
Views: 3322
Reputation: 2719
You can take advantage of the fact that the predict output keeps the original row names in its names() attribute, so the predictions can be assigned back to the matching rows:
pred <- predict(model, tmp)
tmp[names(pred), "predict"] <- pred
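If a full-length vector is wanted rather than a new column, the same names() trick can be used to pad the predictions with NA. This is only a minimal sketch built on the question's iris example; the variable names pred and full are illustrative, and it assumes the default na.action of predict (na.omit), which drops the incomplete rows.

```r
library(e1071)

model <- svm(Species ~ ., data = iris)

tmp <- iris
tmp[1, "Sepal.Length"] <- NA

# With the default na.action (na.omit), rows containing NA are dropped and
# the remaining predictions are named after their original rows.
pred <- predict(model, tmp)

# Start from an all-NA factor with one slot per row of tmp ...
full <- factor(rep(NA_character_, nrow(tmp)), levels = levels(pred))
names(full) <- rownames(tmp)

# ... and fill in only the rows that actually received a prediction.
full[names(pred)] <- pred

length(full)
head(full)
```

The result has one entry per row of tmp, with NA wherever the row could not be scored.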
Upvotes: 0
Reputation: 93813
You could just assign all the valid cases back to a prediction variable in the tmp set:
tmp[complete.cases(tmp), "predict"] <- predict(model, newdata=tmp[complete.cases(tmp),])
tmp
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species predict
# 1           NA         3.5          1.4         0.2  setosa    <NA>
# 2          4.9         3.0          1.4         0.2  setosa  setosa
# 3          4.7         3.2          1.3         0.2  setosa  setosa
# ...
Upvotes: 1
Reputation: 4082
If you are familiar with the caret package, which lets you fit 233 different types of models (including SVM from package e1071), the section called "models clustered by tag similarity" contains a CSV with the data they used to group the algorithms.
That file has a column called Handle Missing Predictor Data, which tells you which algorithms can do what you want. Unfortunately, SVM is not among them.
If you still insist on using SVM, you could use the knnImpute method of the preProcess function from the same package, which should allow you to predict for all your observations.
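A rough sketch of the imputation route, reusing the question's iris example. The names predictors, pp, train and new_x are illustrative, not from the original post. Note that knnImpute also centers and scales the predictors, so this sketch trains the SVM on the transformed data, and depending on the caret version it may need the RANN package to be installed.

```r
library(caret)   # preProcess(); knnImpute may additionally require RANN
library(e1071)

predictors <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Learn centering, scaling and the kNN imputation from the complete training data.
pp <- preProcess(iris[, predictors], method = "knnImpute")

# knnImpute centers and scales, so fit the SVM on the transformed predictors.
train <- predict(pp, iris[, predictors])
train$Species <- iris$Species
model <- svm(Species ~ ., data = train)

tmp <- iris
tmp[1, "Sepal.Length"] <- NA

# Apply the same transformation to the new data; the NA is filled in
# from the nearest neighbours in the training set.
new_x <- predict(pp, tmp[, predictors])
pred <- predict(model, new_x)

length(pred)
```

Unlike the approaches that return NA, this gives a prediction for every row, at the cost of relying on imputed values for the incomplete ones.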
Upvotes: 4