Reputation: 37
I tried computing confusion-matrix for my glm model but I keep getting:
Error:
data
andreference
should be factors with the same levels.
Below is my model:
model3 <- glm(winner ~ srs.1 + srs.2, data = train_set, family = binomial)
confusionMatrix(table(predict(model3, newdata=test_set, type="response")) >= 0.5,
train_set$winner == 1)
winner
variable contains team1
and team2
.
srs.1
and srs.2
are numerical values.
What is my problem here?
Upvotes: 3
Views: 138
Reputation: 47008
I suppose your winner
label is a binary of 0,1. So let's use the example below:
library(caret)
set.seed(111)
data = data.frame(
srs.1 = rnorm(200),
srs.2 = rnorm(200)
)
data$winner = ifelse(data$srs.1*data$srs.2 > 0,1,0)
idx = sample(nrow(data),150)
train_set = data[idx,]
test_set = data[-idx,]
model3 <- glm(winner ~ srs.1 + srs.2, data = train_set, family = binomial)
Like you did, we try to predict, if > 0.5, it will be 1 else 0. You got the table() about right. Note you need to do it both for test_set, or train_set:
pred = as.numeric(predict(model3, newdata=test_set, type="response")>0.5)
ref = test_set$winner
confusionMatrix(table(pred,ref))
Confusion Matrix and Statistics
ref
pred 0 1
0 12 5
1 19 14
Accuracy : 0.52
95% CI : (0.3742, 0.6634)
No Information Rate : 0.62
P-Value [Acc > NIR] : 0.943973
Kappa : 0.1085
Upvotes: 2