Reputation: 12683
Using randomForest, I get an RF object, e.g.:
forest <- randomForest(as.formula(generic), data=train, mtry=2)
Using predict, I can predict the response on a test dataset.
The response is either A, B, or C.
prediction <- predict(forest, newdata=test, type='class')
mytable <- table(test$class_w,prediction)
sum(mytable[row(mytable) != col(mytable)]) / sum(mytable)  # show error
Printing the forest object gives the confusion matrix:
    A   B   C class.error
A 498  79 170   0.3333333
B 115 353 237   0.4992908
C  96  99 967   0.1678141
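For reference, the per-class errors and the overall error rate can be recomputed from the counts in that matrix. This is a minimal sketch in base R, assuming rows are the true classes and columns the predicted ones; the names conf, class_error and overall_error are introduced here only for illustration.

```r
# Confusion matrix counts copied from the printed forest output above
# (assumed layout: rows = true class, columns = predicted class).
conf <- matrix(
  c(498,  79, 170,
    115, 353, 237,
     96,  99, 967),
  nrow = 3, byrow = TRUE,
  dimnames = list(true = c("A", "B", "C"), predicted = c("A", "B", "C"))
)

# Per-class error: share of each true class that was misclassified.
class_error <- 1 - diag(conf) / rowSums(conf)

# Overall error: off-diagonal counts over all counts, the same quantity
# as sum(mytable[row(mytable) != col(mytable)]) / sum(mytable).
overall_error <- 1 - sum(diag(conf)) / sum(conf)

print(round(class_error, 7))
print(overall_error)
```

The per-class values should agree with the class.error column, which is a quick check that the row/column orientation has been read correctly.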
E.g., the test dataset:
id |class_w| valueA | valueB |
1 | C | 0.254 | 0.334 |
2 | A | 0.654 | 0.334 |
3 | A | 0.554 | 0.314 |
4 | B | 0.454 | 0.224 |
5 | C | 0.354 | 0.332 |
6 | C | 0.264 | 0.114 |
7 | C | 0.264 | 0.664 |
I would like to know if I can create a new dataset with two columns: the id from the test dataset and the predicted response (the one the RF gave). E.g.:
row id of test dataset | predicted response
1 | A #failed
2 | B #failed
3 | B #failed
4 | B #TRUE!
Thanks in advance for your help.
Upvotes: 1
Views: 1256
Reputation: 9
Another way to do that would be to write something like this:
yourNewDataSet$someNewColumnCreated <- predict(forest, yourNewDataSet, type="class")
This should give you a new column in your new dataset, named 'someNewColumnCreated',
that contains the model's predictions for each row of that dataset.
Upvotes: 1
Reputation: 173577
I think you may simply be looking to create a new data frame:
data.frame(id = test$id,response = prediction)
That assumes that id is in fact a column in test, rather than the row names. If they are row names, then you'd want to do:
data.frame(id = rownames(test), response = prediction)
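Here is a self-contained sketch of the same idea in base R, using the first four rows of the question's example data. The prediction factor is typed in by hand from the question's desired output as a stand-in for the result of predict(forest, newdata = test), and the correct column is an optional extra added for illustration.

```r
# First four rows of the example test set from the question.
test <- data.frame(
  id      = 1:4,
  class_w = factor(c("C", "A", "A", "B"), levels = c("A", "B", "C")),
  valueA  = c(0.254, 0.654, 0.554, 0.454),
  valueB  = c(0.334, 0.334, 0.314, 0.224)
)

# Stand-in for predict(forest, newdata = test); values are taken from the
# question's desired output, not produced by a real model.
prediction <- factor(c("A", "B", "B", "B"), levels = c("A", "B", "C"))

# The predictions are assumed to be in the same row order as test, so the
# ids and predictions line up position by position.
result <- data.frame(id = test$id, response = prediction)

# Optional: flag which predictions match the true class.
result$correct <- as.character(result$response) == as.character(test$class_w)

print(result)
```

Comparing as character avoids surprises if the two factors ever end up with different level sets.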
Upvotes: 3