user18723720

Reputation: 21

Getting probability value greater than 1 from my glm model

I have created a logistic regression model in R to try to predict the outcome of cricket matches. However, my model produces probability values greater than 1; for example, the output is 1.031704. Any tips on how I could improve my model to get an accurate estimate of the probability?

set.seed(1)

#Use 70% of dataset as training set and remaining 30% as testing set
sample <- sample(c(TRUE, FALSE), nrow(ODIMT), replace=TRUE, prob=c(0.7,0.3))
train <- ODIMT[sample, ]
test <- ODIMT[!sample, ]

#Fit on the training set so that the test set stays held out
model <- glm(Result ~ Target + Opposition + Country, family="binomial", data=train)

options(scipen=999)

summary(model)

pscl::pR2(model)["McFadden"]
caret::varImp(model)
car::vif(model)

new <- data.frame(Target = 226, Opposition = "v India", Country = "England")
predict(model, new, type="response")

The Result variable is 1 or 0, Target ranges from 0 to 400, and the other two (Opposition and Country) are character variables.

data:

       Country Target Result  Opposition    Ground
       England     NA      1     v India   Kolkata
     Australia    251      0  v Pakistan   Kolkata
  South Africa    168      0     v India     Delhi
    Bangladesh     NA      1  v Pakistan     Delhi
       England    306      0 v Australia Melbourne
   New Zealand     NA      1 v Sri Lanka Melbourne

Output of summary: [image not available]

Upvotes: 1

Views: 1141

Answers (1)

s_pike

Reputation: 2113

I think you are predicting the log-odds value rather than the probability. From the documentation for predict.glm, on the type argument:

the type of prediction required. The default is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus for a default binomial model the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale.

As noted in the comments, if you use type="response" you get the predicted probabilities.
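To make the difference concrete, here is a minimal sketch on simulated data. The `toy` data frame, its coefficients, and the single `Target` predictor are made up for illustration and are not the ODIMT data from the question. It shows that the default prediction is on the log-odds scale, and that `plogis()` (the inverse logit in base R) maps it to the same value that `type = "response"` returns.

```r
# Simulated data for illustration only (NOT the ODIMT data from the question)
set.seed(1)
n <- 200
toy <- data.frame(Target = round(runif(n, 100, 400)))
# Assumed relationship: higher targets make a win (Result = 1) less likely
toy$Result <- rbinom(n, 1, plogis(3 - 0.012 * toy$Target))

fit <- glm(Result ~ Target, family = "binomial", data = toy)
new <- data.frame(Target = 226)

log_odds <- predict(fit, new)                     # default type = "link": log-odds, unbounded
prob     <- predict(fit, new, type = "response")  # probability, always between 0 and 1

# The inverse logit of the log-odds equals the predicted probability
all.equal(unname(plogis(log_odds)), unname(prob))

# If 1.031704 from the question is a log-odds value, the implied probability is
plogis(1.031704)   # roughly 0.737
```

If the 1.031704 in the question did come from a call without type="response", it would correspond to a win probability of about 0.74, not to a probability above 1.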

Have a look at this question for more info.

Upvotes: 1
