Reputation: 6657
What command should I use in R to produce a confusion matrix after using the rpart() and predict() commands to build a prediction model?
# Grow tree
library(rpart)
fit <- rpart(activity ~ ., method="class", data=train.data)
printcp(fit) # display the results
plotcp(fit) # visualize cross-validation results
summary(fit) # detailed summary of splits
# Prune the tree (in my case it is exactly the same as the initial model)
pfit <- prune(fit, cp=0.10) # from cptable
pfit <- prune(fit,cp=fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"])
# Predict using the test dataset
pred1 <- predict(fit, test.data, type="class")
# Show re-substitution error
table(train.data$activity, predict(fit, type="class"))
# Accuracy rate
sum(test.data$activity==pred1)/length(pred1)
I would like to summarise the True Positives, False Negatives, False Positives and True Negatives clearly. It would also be great to have Sensitivity, Specificity, Positive Predictive Value and Negative Predictive Value in the same output.
Source: http://en.wikipedia.org/wiki/Sensitivity_and_specificity
Upvotes: 3
Views: 15856
Reputation: 1323
Use the predict() method with your fit and the original data frame, like so:
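# Level labels in the default factor order (data is the data frame used to fit the model)
activities = sort(unique(data$activity))
# For a classification tree, type = "vector" returns the predicted level index
pred = predict(train.fit, newdata, type = "vector")
newdata$pred = as.vector(pred)
newdata$prediction = activities[newdata$pred]
# Rows: predicted activity; columns: actual activity
tab = table(newdata$prediction, newdata$activity)
print(tab)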
In the example above, the rpart model predicts an activity (a factor variable). pred is numeric, with values corresponding to the levels of the factor. The lookup vector activities = sort(unique(data$activity)) matches the default factor level order, so indexing it with pred recovers the predicted labels.
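To get from a table like tab to the per-class counts and rates the question asks for, something along these lines might work. It is only a sketch in base R: class_stats is a hypothetical helper name, the toy activity labels are made up for illustration, and it assumes the predicted and actual factors share the same levels, so the table is square with rows as predictions and columns as actual values. Each class is treated one-vs-rest.

```r
# Hypothetical helper: one-vs-rest statistics for each class of a square
# confusion table (rows = predicted, columns = actual).
class_stats <- function(tab) {
  stopifnot(nrow(tab) == ncol(tab))
  tp <- diag(tab)              # predicted as the class and actually the class
  fp <- rowSums(tab) - tp      # predicted as the class, actually another
  fn <- colSums(tab) - tp      # actually the class, predicted as another
  tn <- sum(tab) - tp - fp - fn
  data.frame(
    class = rownames(tab),
    TP = tp, FP = fp, FN = fn, TN = tn,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    PPV = tp / (tp + fp),
    NPV = tn / (tn + fn),
    row.names = NULL
  )
}

# Toy data, invented for illustration only
actual <- factor(c("sit", "sit", "walk", "walk", "walk", "run", "run", "sit"),
                 levels = c("run", "sit", "walk"))
predicted <- factor(c("sit", "walk", "walk", "walk", "run", "run", "run", "sit"),
                    levels = levels(actual))

tab <- table(predicted, actual)
stats <- class_stats(tab)
print(stats)
```

With the asker's objects, the equivalent call would presumably be class_stats(table(pred1, test.data$activity)), provided both are factors with identical levels. If installing a package is acceptable, caret::confusionMatrix() reports the same kinds of statistics and may be more convenient than a hand-rolled helper.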
Upvotes: 1