Stakerauo

Reputation: 11

Binomial logistic regression graph interpretation

I am working on a binomial logistic regression analysis with one categorical dependent variable, one continuous independent variable, and one indicator variable. I have run the regression and created plots. I think I have done everything correctly, but I am a little put off by the look of my plots. This is how they look (note: the red dotted line is the indicator variable; it is not included in the code further down):

[Image: practical example]

And this is how I was taught in school that they would look:

[Image: academic example]

Here is a reproducible sample:

set.seed(1)  # fix the seed so the sample is actually reproducible
sample <- data.frame(AWO = sample(0:1, 1000, TRUE),
                     corn = rnorm(1000, 0, 1))

Here is the code I applied:

library(ggplot2)
ggplot(sample, aes(x=corn, y=AWO)) +
      geom_point(alpha = .25) +
      geom_smooth(method = "glm", 
                  method.args = list(family = "binomial"), 
                  se = FALSE)

For those interested, I have also fitted and plotted the regression manually and got the same result:

mwc <- glm(AWO ~ corn, data = sample, family=binomial)
x0 = seq(min(sample$corn), max(sample$corn), length = 1000)
plot(sample$corn, sample$AWO)
pwc = predict(mwc, newdata = data.frame(corn = x0), type = "response")
lines(x0, pwc)

So my question is, have I plotted the regression wrong, or is it simply a case of academia v. practice?

Upvotes: 0

Views: 640

Answers (2)

Allan Cameron

Reputation: 173803

The probability of the outcome barely changes over the range of your data, so you only see a small, nearly flat section of the idealised S-shaped curve. Let's take your example:

library(ggplot2)

set.seed(1)

sample <- data.frame(AWO = sample(0:1,1000, T),
                    corn = rnorm(1000, 0, 1))

myplot <- ggplot(sample, aes(x=corn, y=AWO)) +
   geom_point(alpha = .25) +
   geom_smooth(method = "glm", 
               method.args = list(family = "binomial"), 
               se = FALSE, fullrange = TRUE)

myplot
#> `geom_smooth()` using formula = 'y ~ x'

But now let's zoom out on the x axis:

myplot + xlim(c(-100, 100))
#> `geom_smooth()` using formula = 'y ~ x'

Created on 2023-05-15 with reprex v2.0.2
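
To put numbers on this, here is a minimal sketch that fits the same model with glm() and compares the fitted probabilities at the edges of the observed corn range with those at -100 and 100. The names dat, fit, p_obs and p_wide are my own, chosen for illustration, and the data are simulated in the same way as in the question, so treat the output as indicative only.

```r
set.seed(1)

# Simulated data as in the question: AWO is drawn independently of corn
dat <- data.frame(AWO  = sample(0:1, 1000, TRUE),
                  corn = rnorm(1000, 0, 1))

# Fit the same binomial logistic regression that geom_smooth() fits
fit <- glm(AWO ~ corn, data = dat, family = binomial)
coef(fit)  # the slope on corn is on the log-odds scale

# Fitted probabilities at the two ends of the observed corn range
p_obs <- predict(fit,
                 newdata = data.frame(corn = range(dat$corn)),
                 type = "response")

# Fitted probabilities far outside the data, matching xlim(c(-100, 100))
p_wide <- predict(fit,
                  newdata = data.frame(corn = c(-100, 100)),
                  type = "response")

# Change in fitted probability across each range
abs(diff(p_obs))
abs(diff(p_wide))
```

If the first difference is small compared with the second, the fitted curve is close to a straight line over the data and only shows its S shape once the x axis is extended.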

Upvotes: 1

George Savva

Reputation: 5336

In your sample data AWO is generated independently of corn, so there is no relationship between them (and presumably your real data has no relationship, or only a very weak one). If you simulate data in which AWO depends on corn, you get the kind of graph you expect:

library(ggplot2)

sample <- data.frame(corn = rnorm(1000, 0, 1))
# P(AWO = 1) follows the logistic function of corn;
# plogis(x) is the same as exp(x) / (1 + exp(x))
sample$AWO <- rbinom(1000, 1, plogis(sample$corn))

ggplot(sample, aes(x = corn, y = AWO)) +
  geom_point(alpha = .25) +
  geom_smooth(method = "glm",
              method.args = list(family = "binomial"))
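
The difference also shows up in the fitted coefficients. As a rough sketch, the code below fits the same model to both kinds of simulated data and compares the slopes on corn. The object names (no_effect, with_effect, fit_none, fit_eff, slopes) and the seed are my own choices for illustration, not anything from the question.

```r
set.seed(42)
n <- 1000
corn <- rnorm(n)

# AWO drawn independently of corn, as in the question's sample
no_effect <- data.frame(corn = corn,
                        AWO  = sample(0:1, n, TRUE))

# AWO depends on corn through the logistic function (true slope of 1)
with_effect <- data.frame(corn = corn,
                          AWO  = rbinom(n, 1, plogis(corn)))

fit_none <- glm(AWO ~ corn, data = no_effect,   family = binomial)
fit_eff  <- glm(AWO ~ corn, data = with_effect, family = binomial)

# Estimated slopes on the log-odds scale
slopes <- c(none   = unname(coef(fit_none)["corn"]),
            effect = unname(coef(fit_eff)["corn"]))
slopes
```

A slope near zero gives an almost flat fitted line over the data, while a clearly non-zero slope gives the visible S-shaped curve.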


Upvotes: 0
