Chuck C

Reputation: 153

How to plot logistic regression on the log-odds scale using ggplot2

I am trying to plot the result of a logistic regression on the log-odds scale.

library(ggplot2)
load(url("https://github.com/bossaround/question/raw/master/logisticregressdata.RData"))

ggplot(D, aes(Year, as.numeric(Vote), color = as.factor(Male))) +
  stat_smooth( method="glm", method.args=list(family="binomial"), formula = y~x + I(x^2), alpha=0.5, size = 1, aes(fill=as.factor(Male))) +
  xlab("Year") 

But this plot is on a 0 to 1 scale. I guess this is the probability scale (correct me if I am wrong)?

What I really want is to plot it on the log-odds scale, which is what the logistic regression reports before anything is converted to a probability.
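For reference, the log-odds (logit) of a probability p is log(p / (1 - p)). A minimal sketch of the two scales using base R's qlogis() and plogis(); the probabilities are made up for illustration:

```r
# Made-up probabilities, for illustration only
p <- c(0.1, 0.5, 0.9)

# Log-odds (logit) scale: log(p / (1 - p)); qlogis() computes the same thing
log_odds <- qlogis(p)
all.equal(log_odds, log(p / (1 - p)))

# plogis() is the inverse logit and maps log-odds back to probabilities
p_back <- plogis(log_odds)
all.equal(p_back, p)
```

A probability of 0.5 corresponds to a log-odds of 0, and the log-odds scale is unbounded in both directions.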

Ideally, I want to plot the relationship between Vote and Year, by Male, after controlling for Foreign, in a model like this:

   Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)

I could manually draw the line based on summary(Model), but I also want to plot the confidence interval.

Something like the image on page 44 of this document I found online: http://www.datavis.ca/papers/CARME2015-2x2.pdf. Mine would have a quadratic curve.

Thank you!

Upvotes: 1

Views: 3908

Answers (3)

SUBODH SHARMA

Reputation: 1

Your approach is correct, but you need to predict values from the model you have built, something like this:

ModelPredictions <- predict(Model , type="response")

After that, you can plot using ggplot:

# refer to Vote by name inside aes() rather than as D$Vote
ggplot(D, aes(x = ModelPredictions, y = Vote)) +
  geom_point() +
  stat_smooth(method = "glm", se = FALSE, method.args = list(family = binomial)) +
  facet_wrap(~ Foreign)
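Note that type = "response" returns probabilities; the log-odds asked about in the question come from type = "link". A minimal sketch of the difference, assuming a simulated stand-in for D (same column names as the question, but invented coefficients and data):

```r
# Simulated stand-in for D; the coefficients below are invented for illustration
set.seed(1)
n <- 2000
sim <- data.frame(
  Year    = sample(2:10, n, replace = TRUE),
  Male    = rbinom(n, 1, 0.5),
  Foreign = rbinom(n, 1, 0.3)
)
eta <- 1 - 0.8 * sim$Year + 0.05 * sim$Year^2 + 0.3 * sim$Male + 0.3 * sim$Foreign
sim$Vote <- rbinom(n, 1, plogis(eta))

fit <- glm(Vote ~ Year + I(Year^2) + Male + Foreign, family = "binomial", data = sim)

link_pred <- predict(fit, type = "link")      # log-odds scale, unbounded
resp_pred <- predict(fit, type = "response")  # probability scale, between 0 and 1

# The two are related by the inverse logit
all.equal(plogis(link_pred), resp_pred)
```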

Upvotes: 0

missuse

Reputation: 19716

To plot the predictions of a model with several variables, fit the model, predict on new data, and plot those predictions:

Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
# generate a fine grid of predictor values to get a smooth line
for_pred = expand.grid(Year = seq(from = 2, to = 10, by = 0.1), Male = c(0, 1), Foreign = c(0, 1))

# type = "link" predicts on the log-odds scale; se.fit = TRUE adds the standard errors
for_pred = cbind(for_pred, predict(Model, for_pred, type = "link", se.fit = TRUE))
# if the probability scale was needed: `type = "response"`

library(ggplot2)
ggplot(for_pred, aes(Year, fit, color = as.factor(Male))) +
  geom_line() +
  xlab("Year")+
  facet_wrap(~Foreign)  + #important step - check also how it looks without it
  geom_ribbon(aes(ymax = fit + se.fit, ymin = fit - se.fit, fill = as.factor(Male)), alpha = 0.2) 

# To drop the ribbon outline, set `color = NA` in geom_ribbon, or use `inherit.aes = FALSE` (in that case, provide the data and the full `aes` mapping for geom_ribbon).
# If geom_ribbon should not map `fill` to a variable, specify it outside of `aes`, like: `fill = "grey80"`.
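The ribbon above spans one standard error on either side of the fit. If an approximate 95% interval is wanted instead, one common approach is a Wald interval built on the link scale. A minimal sketch, assuming a simulated stand-in for D (invented coefficients and data) and a normal approximation; the column names lower, upper and prob* are my own:

```r
# Simulated stand-in for D; the coefficients below are invented for illustration
set.seed(1)
n <- 2000
sim <- data.frame(
  Year    = sample(2:10, n, replace = TRUE),
  Male    = rbinom(n, 1, 0.5),
  Foreign = rbinom(n, 1, 0.3)
)
eta <- 1 - 0.8 * sim$Year + 0.05 * sim$Year^2 + 0.3 * sim$Male + 0.3 * sim$Foreign
sim$Vote <- rbinom(n, 1, plogis(eta))

fit <- glm(Vote ~ Year + I(Year^2) + Male + Foreign, family = "binomial", data = sim)

# Prediction grid, as in the answer above
grid <- expand.grid(Year = seq(2, 10, by = 0.1), Male = c(0, 1), Foreign = c(0, 1))
pred <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)

# Approximate 95% Wald interval on the log-odds scale
z <- qnorm(0.975)  # roughly 1.96
grid$fit   <- pred$fit
grid$lower <- pred$fit - z * pred$se.fit
grid$upper <- pred$fit + z * pred$se.fit

# Optional: back-transform the fit and interval to the probability scale
grid$prob       <- plogis(grid$fit)
grid$prob_lower <- plogis(grid$lower)
grid$prob_upper <- plogis(grid$upper)

# Plot on the log-odds scale if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(grid, aes(Year, fit, color = factor(Male), fill = factor(Male))) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2, color = NA) +
    geom_line() +
    facet_wrap(~ Foreign) +
    labs(y = "Log-odds of Vote", color = "Male", fill = "Male")
  print(p)
}
```

Building the interval on the link scale and only then applying plogis() keeps the back-transformed bounds inside 0 and 1.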


Also check out the sjPlot package.

Upvotes: 3

Edu

Reputation: 903

A further answer using augment() from the broom package:

Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
summary(Model)


# augmented data frame (.fitted is on the link, i.e. log-odds, scale by default)
library(broom)
library(dplyr)

model.df.1 = augment(Model) %>% rename(log_odds = `.fitted`,
                                       Sex = Male)

glimpse(model.df.1)
Observations: 46,398
Variables: 13
$ .rownames  <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21"...
$ Vote       <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, ...
$ Year       <int> 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 6, 2, 2, 2, 3, 4, 2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 2, 3, 4, 5, ...
$ I.Year.2.  <S3: AsIs>  4,  9, 16, 25,  4,  9,  4,  9, 16, 25, 36,  4,  4,  4,  9, 16,  4,  9, 16, 25, 36, 49, 64, 81,  4,  9, 16, 2...
$ Sex        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ Foreign    <int> 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ log_odds   <dbl> -0.01910985, -0.68184753, -1.14317053, -1.40307885, 0.26930939, -0.39342829, -0.01910985, -0.68184753, -1.14317053...
$ .se.fit    <dbl> 0.01675017, 0.01466136, 0.01790972, 0.02058514, 0.01931826, 0.01777691, 0.01675017, 0.01466136, 0.01790972, 0.0205...
$ .resid     <dbl> -1.1693057, -0.9047053, -0.7439452, 1.8016037, -1.2937083, 1.3483961, -1.1693057, -0.9047053, -0.7439452, -0.66303...
$ .hat       <dbl> 7.013561e-05, 4.794678e-05, 5.879536e-05, 6.711744e-05, 9.162739e-05, 7.602458e-05, 7.013561e-05, 4.794678e-05, 5....
$ .sigma     <dbl> 1.124879, 1.124884, 1.124886, 1.124861, 1.124876, 1.124874, 1.124879, 1.124884, 1.124886, 1.124887, 1.124860, 1.12...
$ .cooksd    <dbl> 1.376354e-05, 4.849628e-06, 3.749311e-06, 5.461011e-05, 2.399355e-05, 2.253792e-05, 1.376354e-05, 4.849628e-06, 3....
$ .std.resid <dbl> -1.1693467, -0.9047270, -0.7439671, 1.8016642, -1.2937676, 1.3484474, -1.1693467, -0.9047270, -0.7439671, -0.66305...


#visualise

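# Sex is stored as an integer; convert it to a factor so each sex gets its own line and colour
ggplot(model.df.1, aes(Year, log_odds, colour = factor(Sex))) +
  geom_line() +
  geom_smooth(se = TRUE) +
  facet_wrap(~ Foreign)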

Which gives:


Upvotes: 2
