Reputation: 4879
I am not sure here if I am doing something wrong or this is a bug in confint
function in R
itself but I am getting confidence intervals for regression estimate which don't contain the estimate.
Here is reprex
:
# model (converting all numeric columns in data to z-scores)
mod <-
stats::lm(
formula = cbind(mpg, disp) ~ wt,
data = purrr::modify_if(.x = mtcars, .p = is.numeric, .f = scale)
)
# tidy dataframe
broom::tidy(mod, conf.int = TRUE)
#> # A tibble: 4 x 8
#> response term estimate std.error statistic p.value conf.low conf.high
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 mpg (Intercept) 8.05e-17 0.0893 9.01e-16 1.00e+ 0 -0.182 0.182
#> 2 mpg wt -8.68e- 1 0.0908 -9.56e+ 0 1.29e-10 -1.05 -0.682
#> 3 disp (Intercept) -1.70e-16 0.0826 -2.06e-15 1.00e+ 0 -0.182 0.182
#> 4 disp wt 8.88e- 1 0.0840 1.06e+ 1 1.22e-11 -1.05 -0.682
confint(mod)
#> 2.5 % 97.5 %
#> :(Intercept) -0.1824544 0.1824544
#> :wt -1.0530332 -0.6822855
#> :(Intercept) -0.1824544 0.1824544
#> :wt -1.0530332 -0.6822855
This becomes more conspicuous if you plot the estimates:
Am I doing something wrong here? Or this is expected behavior?
Upvotes: 2
Views: 314
Reputation: 46898
confint() does work with mlm now:
confint(lm(mpg~wt,data=mtcars))
2.5 % 97.5 %
(Intercept) 33.450500 41.119753
wt -6.486308 -4.202635
confint(lm(disp~wt,data=mtcars))
2.5 % 97.5 %
(Intercept) -204.0914 -58.2054
wt 90.7579 134.1984
confint(lm(cbind(mpg,disp)~wt,data=mtcars))
2.5 % 97.5 %
mpg:(Intercept) 33.450500 41.119753
mpg:wt -6.486308 -4.202635
disp:(Intercept) -204.091436 -58.205395
disp:wt 90.757897 134.198380
The problem is with using scale inside dplyr or purrr, I used as.data.frame after scale because all your columns are numeric anyway:
confint(lm(cbind(mpg,disp)~wt,data=as.data.frame(scale(mtcars))))
2.5 % 97.5 %
mpg:(Intercept) -0.1824544 0.1824544
mpg:wt -1.0530332 -0.6822855
disp:(Intercept) -0.1687740 0.1687740
disp:wt 0.7165054 1.0594545
For dplyr and purrr, you can see that the variable names are messed up too:
library(dplyr)
library(purrr)
confint(lm(cbind(mpg,disp)~wt,data=modify_if(.x = mtcars, .p = is.numeric, .f = scale)))
2.5 % 97.5 %
:(Intercept) -0.1824544 0.1824544
:wt -1.0530332 -0.6822855
:(Intercept) -0.1824544 0.1824544
:wt -1.0530332 -0.6822855
confint(lm(cbind(mpg,disp)~wt,data=mutate_if(mtcars,is.numeric,scale))
2.5 % 97.5 %
:(Intercept) -0.1824544 0.1824544
:wt -1.0530332 -0.6822855
:(Intercept) -0.1824544 0.1824544
:wt -1.0530332 -0.6822855
I am guessing when you do scale, the attributes that are carried forward messes with something in confint (still looking at the code.) One way you can solve this is to force the output to a vector (using c or as.numeric):
confint(lm(cbind(mpg,disp)~wt,data=modify_if(mtcars,is.numeric,~c(scale(.x)))))
or
confint(lm(cbind(mpg,disp)~wt,data=mutate_if(mtcars,is.numeric,~c(scale(.x)))))
2.5 % 97.5 %
mpg:(Intercept) -0.1824544 0.1824544
mpg:wt -1.0530332 -0.6822855
disp:(Intercept) -0.1687740 0.1687740
disp:wt 0.7165054 1.0594545
Unfortunately, the tidy(..,conf.int=TRUE)
does not quite work, so for plotting you need to use the confint
from stats with some manipulation.
Upvotes: 3