Reputation: 345
I'm trying to plot several model coefficients, some of which are statistically significant and some of which are not. In addition, when I try the other specification of m1 (the commented-out line below), an error is returned.
library("nycflights13")
library(dplyr)
library(dotwhisker)
library(MASS)
flights <- nycflights13::flights
flights <- sample_n(flights, 500)
m1 <- glm(formula = arr_delay ~ dep_time + origin + air_time + distance, data = flights)
#m1 <- glm(formula = arr_delay ~ . , data = flights)
m1 <- stepAIC(m1)
summary(m1)
dwplot(m1)
dwplot(m1 + geom_vline(xintercept=0, lty=2)) ## This is meant to add a line on the CI
How can I assign different colors to coefficients depending on whether they are statistically significant?
EDIT 1: This code works really well, but when I change the threshold to 0.05, all the results are drawn in orange, as displayed. Any thoughts?
library(tidyr)  # nest()
library(purrr)  # map()
df <- mtcars
nested_inter <- mtcars %>%
  group_by(gear) %>%
  nest() ## groups all the data by the sub series
nested_inter <- nested_inter %>%
  mutate(model = map(data,
                     ~lm(formula = mpg ~ cyl + drat + hp + wt, data = .)))
p<- dotwhisker::dwplot(nested_inter$model[[2]])
#print(p)
z<- p +
geom_vline(xintercept=0, linetype="dashed")+
geom_segment(aes(x=conf.low,y=term,xend=conf.high,
yend=term,col=p.value<0.05)) +
geom_point(aes(x=estimate,y=term,col=p.value<0.05)) +
xlab("standardized coefficient") +
ylab("coefficient") +
ggtitle("coefficients in the model and significance")
print(z)
Upvotes: 1
Views: 1448
Reputation: 47008
You can add geom_vline outside the dwplot call, i.e. dwplot(m1) + geom_vline(xintercept=0, lty=2). To add colors, you have to specify them beforehand and pass them through the dot_args and line_args arguments. Unfortunately, I think you can only specify the color of the dots this way; the argument for the lines doesn't work (at least in my hands).
First, you can see that the plot object stores its data like this:
p = dwplot(m1)
p$data
# A tibble: 3 x 10
  term  estimate std.error statistic  p.value conf.low conf.high by_2sd model
  <chr>    <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl> <lgl>  <fct>
1 dep_…     28.0      4.18      6.71 5.54e-11     19.8      36.2 TRUE   one
2 air_…    143.      30.0       4.76 2.55e- 6     84.0     201.  TRUE   one
3 dist…   -143.      30.0      -4.78 2.33e- 6   -202.      -84.5 TRUE   one
# … with 1 more variable: y_ind <dbl>
So we just draw new layers over the plot using that data. For illustration, treat only p < 1e-06 as significant; this makes dep_time the only significant variable, so both colors show up:
p +
geom_vline(xintercept=0, linetype="dashed")+
geom_segment(aes(x=conf.low,y=term,xend=conf.high,
yend=term,col=p.value<1e-6))+
geom_point(aes(x=estimate,y=term,col=p.value<1e-6))
The other option is to do it from scratch using the actual coefficients from the model.
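As a rough sketch of that from-scratch route, the coefficients can be pulled out with broom::tidy() and drawn with plain ggplot2. Everything here is an assumption made for illustration: the lm on mtcars stands in for your model, and the names fit, coefs, significant and coef_plot, the 0.05 cutoff and the two colors are arbitrary choices. Note that these are the raw estimates from the model, so the scale may differ from what dwplot draws.

```r
library(broom)
library(ggplot2)

# Assumption: a small lm on mtcars stands in for the model in the question
fit <- lm(mpg ~ cyl + drat + hp + wt, data = mtcars)

# tidy() gives one row per term; conf.int = TRUE adds conf.low / conf.high
coefs <- tidy(fit, conf.int = TRUE)
coefs <- coefs[coefs$term != "(Intercept)", ]

# Flag significance once so segments and points share the same mapping
coefs$significant <- coefs$p.value < 0.05

coef_plot <- ggplot(coefs, aes(x = estimate, y = term, colour = significant)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_segment(aes(x = conf.low, xend = conf.high, yend = term)) +
  geom_point() +
  # Named values keep the colors fixed even if only one level is present
  scale_colour_manual(values = c(`TRUE` = "firebrick", `FALSE` = "grey50"),
                      name = "p < 0.05") +
  labs(x = "estimate", y = "term")
print(coef_plot)
```

Because the colors are named by TRUE/FALSE, the legend stays consistent when you change the cutoff, even if every coefficient ends up on the same side of it.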
Upvotes: 2