Reputation: 21
I'm trying to predict the class of four Data Deficient species using the predict() function in randomForest. I've run RF on my original data and created a RF object, and I then want to use this to predict the class of the new data.
The code I am using is:
# original data set "procellminvar"
# DD sp only "procelldd"
#run RF on original data set
procellminvar$current.red.list<-factor(procellminvar$current.red.list)
procell6<-procellminvar[,6:80]
procell6.imputed<-rfImpute(current.red.list~.,procell6)
procellminvar.rf<-randomForest(current.red.list~., data=procell6.imputed, importance=TRUE, ntree=1000)
round(importance(procellminvar.rf),2)
#run prediction using original data and new data (DD sp only)
predict(procellminvar.rf, procelldd)
The RF runs fine, but when I try to run predict() I get an error message:
predict(procellminvar.rf, procelldd)
# Error in eval(expr, envir, enclos) : object 'subpop' not found
I don't understand why. Could anybody explain to me in simple terms what I'm doing wrong here?
Upvotes: 2
Views: 2225
Reputation: 2481
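I think the problem is that the data you pass to predict() doesn't match the data the model was trained on. You trained on columns 6:80 of procellminvar (after imputing with rfImpute), but you are predicting on procelldd as it is. Because the model was fitted with a formula, predict() looks up every predictor by name in the new data, and the error says it can't find 'subpop' there. So you need to make sure that each variable used in the training is also present, under the same name, in the data you predict on.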
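A quick way to check this is to compare the column names of the two data frames before calling predict(). The following is only a sketch in base R: `missing_predictors` is a hypothetical helper (not part of randomForest), and the toy data frames and the `body.mass` column are made-up stand-ins for your real procell6.imputed and procelldd.

```r
# Hypothetical helper (not from randomForest): names of the predictors
# present in the training data but absent from the new data.
missing_predictors <- function(train, newdata, response) {
  predictors <- setdiff(names(train), response)
  setdiff(predictors, names(newdata))
}

# Toy stand-ins with made-up values; replace with the real data frames.
train <- data.frame(current.red.list = factor(c("LC", "VU", "EN")),
                    subpop = c(1, 3, 2),
                    body.mass = c(120, 450, 300))
new_dd <- data.frame(body.mass = c(200, 380))

missing_predictors(train, new_dd, "current.red.list")
# [1] "subpop"
```

With your own objects the call would be missing_predictors(procell6.imputed, procelldd, "current.red.list"). If it returns anything other than character(0), those columns have to be added to procelldd before predict() can work.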
Upvotes: 1