Sindhu Viswanathan

Reputation: 13

Random Forest Predict

I have a training dataset with 40,000 rows, and I was able to fit a randomForest model on it successfully.

I am now trying to use the model to predict on my test set. My training and test sets are two different data frames, and the column names do not match between them. I get an error when I run predict.

library(randomForest)
set.seed(2018)
new_train_rf <- randomForest(workdf.V1 ~ ., data = new_train_df, mtry = 6, ntree = 25)
new_train_rf
summary(new_train_rf)

Test Dataset predictions:

test_pred = predict(new_train_rf, newdata=new_test_df)
test_pred
summary(test_pred)

Error in eval(predvars, data, env) : object 'Var57' not found

Column names in the test data frame:

testdf.Var218_UYBR, testdf2.Var6, testdf2.Var13, testdf2.Var21

Column names in the training data frame:

workdf.Var218_UYBR, tempdf2.Var6, tempdf2.Var13, tempdf2.Var21

Please help! I am new to R and have been trying to figure out why my prediction is not working.

Upvotes: 0

Views: 92

Answers (1)

Sindhu Viswanathan

Reputation: 13

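My test and training data were split into two different data frames. After I renamed the columns in both data frames (test and training) so that they match, the predictions ran successfully. predict() looks up the predictors by name, so newdata must contain columns named exactly as they were when the model was trained.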
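For anyone hitting the same error, here is a minimal sketch of one way to line the names up. It assumes the mismatch is only the data-frame prefix in front of the first dot (workdf., tempdf2., testdf., testdf2.), as in the column lists in the question. The toy values and the helper name strip_prefix are made up for illustration; they are not from my real data.

```r
# Toy stand-ins for the two data frames; the values are invented.
new_train_df <- data.frame(
  workdf.V1          = c(1, 0, 1),
  workdf.Var218_UYBR = c(0, 1, 0),
  tempdf2.Var6       = c(10, 20, 30),
  tempdf2.Var13      = c(1.5, 2.5, 3.5)
)
new_test_df <- data.frame(
  testdf.Var218_UYBR = c(1, 0),
  testdf2.Var6       = c(15, 25),
  testdf2.Var13      = c(2.0, 3.0)
)

# Hypothetical helper: drop everything up to and including the first dot,
# so "tempdf2.Var6" and "testdf2.Var6" both become "Var6".
strip_prefix <- function(df) {
  names(df) <- sub("^[^.]*\\.", "", names(df))
  df
}

train_clean <- strip_prefix(new_train_df)
test_clean  <- strip_prefix(new_test_df)

# Predictors present in training but absent from the test set.
# "V1" is the response here, so it is excluded from the check.
missing_cols <- setdiff(setdiff(names(train_clean), "V1"), names(test_clean))
stopifnot(length(missing_cols) == 0)
```

With the names aligned, the model would be refit on the renamed training data, for example randomForest(V1 ~ ., data = train_clean, mtry = 6, ntree = 25), and predict(new_train_rf, newdata = test_clean) should then find every predictor. If missing_cols is not empty, it lists the columns that predict would complain about.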

Upvotes: 0
