Regression with new factor levels in test set - how to gracefully ignore error

Question

Is there any way for R to 'gracefully' handle new factor levels in the test set, which would normally crash the prediction completely? Normally, if there is just one bad value, the entire operation doesn't work.

Ideally, predictions would be made where the values are valid, and only the rows with new factor levels would produce an error value (maybe an NA).

A really crude example, but here is what I'm getting at:

  library(randomForest)
  df <- mtcars
  df$vs <- 99
  df[1, "vs"] <- 0  # only the first row keeps a level seen in training
  df$vs <- factor(df$vs)
  mtcars$vs <- factor(mtcars$vs)

  fit <- lm(mpg ~ ., data = mtcars)
  # fit above works with the explanation given below, but fit2 fails with randomForest. Why?
  fit2 <- randomForest(mpg ~ ., data = mtcars)
  # the first row should work; the others should error gracefully, maybe with an NA?
  df$help <- predict(fit, df)

The first response I got has been great. However, it still fails for a less simplistic example, such as the randomForest one above.
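
For illustration, here is a minimal sketch of one possible workaround, not necessarily the approach from the response mentioned above. The helper name `na_unseen_levels` and the `train`/`test` data frames are hypothetical. The idea is to recode every factor column in the test set against the training levels, so that unseen values become NA, and then let `predict()` return NA for those rows. The randomForest part assumes the package is installed and, to avoid relying on how its `predict()` treats missing values, only predicts on the complete rows.

```r
# Hypothetical helper: recode factor columns of `test` against the levels
# seen in `train`, so that values unseen in training become NA.
na_unseen_levels <- function(train, test) {
  for (col in names(test)) {
    if (col %in% names(train) && is.factor(train[[col]])) {
      # factor() with explicit levels maps anything outside them to NA
      test[[col]] <- factor(as.character(test[[col]]),
                            levels = levels(train[[col]]))
    }
  }
  test
}

# Training data: vs is a factor with levels "0" and "1"
train <- mtcars
train$vs <- factor(train$vs)

# Test data: only the first row has a level seen in training
test <- mtcars
test$vs <- 99
test[1, "vs"] <- 0
test$vs <- factor(test$vs)

clean <- na_unseen_levels(train, test)

# lm: na.pass keeps the rows with NA, and their predictions come back as NA
fit <- lm(mpg ~ ., data = train)
pred <- predict(fit, clean, na.action = na.pass)

# randomForest (assumed installed): predict on complete rows only and
# fill the remaining rows with NA by hand
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  fit2 <- randomForest::randomForest(mpg ~ ., data = train)
  ok <- complete.cases(clean)
  pred_rf <- rep(NA_real_, nrow(clean))
  pred_rf[ok] <- predict(fit2, clean[ok, , drop = FALSE])
}
```

With this sketch, `pred` should hold a numeric prediction for the first row and NA for every row whose `vs` value was not seen in training. Whether silently returning NA is acceptable depends on the use case; counting the NA rows afterwards is one way to keep the problem visible.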
