Reputation: 13
I have a training dataset with 40,000 rows, and I was able to fit a randomForest model on it.
I am now trying to use that model to predict on my test set. My training and test sets are 2 different data frames, and the column names do not match between them. I get an error when I run predict.
library(randomForest)
set.seed(2018)
new_train_rf = randomForest(workdf.V1 ~ ., data = new_train_df, mtry = 6, ntree = 25)
new_train_rf
summary(new_train_rf)
Test Dataset predictions:
test_pred = predict(new_train_rf, newdata=new_test_df)
test_pred
summary(test_pred)
Error in eval(predvars, data, env) : object 'Var57' not found
Column Names in Test Dataframe:
testdf.Var218_UYBR, testdf2.Var6, testdf2.Var13, testdf2.Var21
Column Names in Training Dataframe:
workdf.Var218_UYBR, tempdf2.Var6, tempdf2.Var13, tempdf2.Var21
Please help! I am new to R and have been trying to figure out why my prediction is not working.
Upvotes: 0
Views: 92
Reputation: 13
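My test and training data were split into 2 different data frames. predict() looks up the model's predictors by name in newdata, which is why the mismatched column names produced the "object not found" error. I was able to run my predictions successfully after I renamed the columns in both data frames (test and training) so that they match.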
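For anyone hitting the same error, here is a rough sketch of one way to do the renaming. It is not my exact code: train_df and test_df are small made-up stand-ins, strip_prefix is a hypothetical helper, and it assumes that the part of each column name after the first "." identifies the same variable in both data frames (as in workdf.Var218_UYBR vs testdf.Var218_UYBR). If your names do not follow that pattern, you would need to build the mapping by hand.

```r
# Toy stand-ins for the real data frames (made-up values; only the
# naming pattern mirrors the question).
train_df <- data.frame(
  workdf.V1          = factor(rep(c("a", "b"), 10)),
  workdf.Var218_UYBR = rep(c(1, 0), 10),
  tempdf2.Var6       = seq(10, 200, by = 10)
)
test_df <- data.frame(
  testdf.Var218_UYBR = c(0, 1),
  testdf2.Var6       = c(15, 35)
)

# Predictors the model is trained on: every column except the response.
predictors <- setdiff(names(train_df), "workdf.V1")

# Names the model expects but the test set does not have yet.
missing_before <- setdiff(predictors, names(test_df))

# Hypothetical helper: drop everything up to and including the first ".".
# Assumption: the remaining suffix identifies the same variable in both frames.
strip_prefix <- function(x) sub("^[^.]*\\.", "", x)

# Rename each test column to the training column with the same suffix.
idx <- match(strip_prefix(names(test_df)), strip_prefix(predictors))
names(test_df)[!is.na(idx)] <- predictors[idx[!is.na(idx)]]

# Should now be empty; anything left here would still trigger the error.
missing_after <- setdiff(predictors, names(test_df))

# Fit and predict only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(2018)
  rf <- randomForest::randomForest(workdf.V1 ~ ., data = train_df, ntree = 25)
  test_pred <- predict(rf, newdata = test_df)
}
```

Checking setdiff(predictors, names(test_df)) before calling predict is a quick way to see every mismatched column at once, since the error message only reports the first one it hits.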
Upvotes: 0