Vale Y

Reputation: 15

How to plot multiple variables from regression model in R?

I am trying to create a plot from the outputs of a logistic regression model where multiple plots are combined:

I have run a logistic regression model on data that looks like this:

   gender english art science sports geography insured
1  Female       0   1       0      0         0       1
2  Female       1   1       0      1         1       1
3  Female       1   0       0      1         1       1
4  Female       1   0       0      0         1       1
5  Female       1   1       1      0         1       1
6  Female       1   1       1      0         0       0
7    Male       1   1       1      1         0       1
8    Male       1   1       1      1         0       0
9  Female       1   1       0      0         0       1
10   Male       1   1       0      0         1       0
11 Female       1   1       0      0         1       1

I fitted the model and plotted its output using the effects package. This is the code I used:

library(effects)

df_fit <- glm(insured ~ english + art + science + gender, data = df, family = 'binomial')

plot(Effect(focal.predictors = c("art", "gender"), df_fit), rug = FALSE)

This is what the plot looks like (image: effect plot).

How can I adjust my code so that the predicted glm outputs for the '1' values of the variables english:science appear on the left, the predicted outputs for the '0' values of the same variables appear on the right, and both are separated by gender?

I have tried using gather (from tidyr) to combine the english:science columns into a single variable in a long dataset, but this causes errors in the regression model and disrupts the data.

Is there another way to plot this?

This is my desired output: (image)

Upvotes: 0

Views: 1044

Answers (1)

Miff

Reputation: 7941

You can do something like:

#Create a prediction data frame with each effect separated out:
#within each gender, row i has a 1 for subject i only
new_data <- data.frame(gender=rep(c("Female","Male"), each=5), english=c(1,0,0,0,0), art=c(0,1,0,0,0),
                                  science=c(0,0,1,0,0), sports=c(0,0,0,1,0), geography=c(0,0,0,0,1),
                                  subject=c("english", "art", "science", "sports", "geography"))

#Predictions for the new data
fits <- predict(df_fit, newdata=new_data, type="response", se.fit=TRUE)[1:2]
new_data <- cbind(new_data, val=fits[[1]], se=fits[[2]])

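In this case the predictions are made using a new data frame that, for each of male and female, has only one of the subjects set to 1 in each row. These aren't strictly 'partial effects'; they're the predictions for each of the cases. Note that sports and geography are not terms in df_fit, so their rows give the prediction with english, art and science all at 0.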
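The one-subject-per-row layout described above can also be generated rather than typed by hand. This is only a minimal sketch, assuming the subject column names from the question; `subjects` and `indicators` are illustrative names, not part of the code above:

```r
subjects <- c("english", "art", "science", "sports", "geography")

# 5 x 5 identity matrix: row i has a 1 in column i and 0 elsewhere
indicators <- as.data.frame(diag(length(subjects)))
names(indicators) <- subjects
indicators$subject <- subjects

# Stack the same block once per gender
new_data <- data.frame(
  gender = rep(c("Female", "Male"), each = length(subjects)),
  indicators[rep(seq_along(subjects), times = 2), ]
)
rownames(new_data) <- NULL
```

Building the grid from `diag()` makes it harder to end up with a row where several subjects are switched on at once.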

Using the power of ggplot to get the figure you're after:

#plot
library(ggplot2)
ggplot(new_data, aes(x=subject, y=val, ymin=val-se, ymax=val+se)) +
  geom_point() + geom_errorbar() + facet_wrap(~gender) +
  ylab("Predicted probability (+/- 1 se)")

Output plot
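The error bars in the plot are val +/- se on the response scale, which can stray outside 0 to 1 when a predicted probability is close to either end. One common alternative, which is not what the code above does, is to build the interval on the link scale and back-transform it with plogis(). A rough sketch on simulated data (the `sim` data frame, its coefficients and the `grid` layout are all made up for illustration, not the asker's df):

```r
set.seed(1)
n <- 200
sim <- data.frame(
  gender  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  english = rbinom(n, 1, 0.5),
  art     = rbinom(n, 1, 0.5),
  science = rbinom(n, 1, 0.5)
)
# Simulated outcome; the effect sizes here are arbitrary
sim$insured <- rbinom(n, 1, plogis(-0.5 + 0.8 * sim$english + 0.4 * (sim$gender == "Male")))
sim_fit <- glm(insured ~ english + art + science + gender, data = sim, family = "binomial")

grid <- expand.grid(gender = c("Female", "Male"), english = 0:1, art = 0, science = 0)

# Predict on the link (log-odds) scale, then map fit +/- 1 se back to probabilities
link <- predict(sim_fit, newdata = grid, type = "link", se.fit = TRUE)
grid$val  <- plogis(link$fit)
grid$ymin <- plogis(link$fit - link$se.fit)
grid$ymax <- plogis(link$fit + link$se.fit)
```

The resulting ymin and ymax columns can be passed to aes() in place of val-se and val+se.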

Note that the actual partial effects (rather than the predictions from each effect) can be seen with:

predict(df_fit, newdata=new_data, type="terms", se.fit=TRUE)[1:2]
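To illustrate what type="terms" returns, here is a small sketch on simulated data (again, `sim`, `sim_fit` and `grid` are made-up names and values, not the asker's data). The result has one column per model term, on the link scale and centred, and adding the row sums to the "constant" attribute gives back the linear predictor:

```r
set.seed(42)
n <- 200
sim <- data.frame(
  gender  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  english = rbinom(n, 1, 0.5),
  art     = rbinom(n, 1, 0.5)
)
# Simulated outcome; the effect size here is arbitrary
sim$insured <- rbinom(n, 1, plogis(-0.3 + 0.7 * sim$art))
sim_fit <- glm(insured ~ english + art + gender, data = sim, family = "binomial")

grid <- data.frame(
  gender  = factor(c("Female", "Male"), levels = levels(sim$gender)),
  english = c(1, 0),
  art     = c(0, 1)
)

terms_fit <- predict(sim_fit, newdata = grid, type = "terms")
link_fit  <- predict(sim_fit, newdata = grid, type = "link")

# Per-term contributions plus the constant reproduce the log-odds prediction
rebuilt <- rowSums(terms_fit) + attr(terms_fit, "constant")
```

Because these values are log-odds contributions rather than probabilities, they are not directly comparable to the type="response" predictions plotted earlier.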

Upvotes: 0
