Reputation: 11

Construct 95% confidence interval for regression model

I need some help here. I've got the following table explaining workplace turnover:

..............B(hat) se

Gender -0.01 0.55

Age -0.01 0.03

Job satisfaction -0.12 0.08

So, workplace turnover is coded as 1 for those who change jobs, and job satisfaction is measured on an interval scale coded from 0 = very unsatisfied to 10 = very satisfied.

I am supposed to make a 95% confidence interval for job satisfaction. However, I am not quite sure what to do. I have made this so far:

c(-0.12 - qnorm(0.975) * 0.08, -0.012 + qnorm(0.975) * 0.08)

From this I get the following result: [1] -0.2767971 0.1447971

I am not quite sure whether this is the correct way.

Does anyone have any input that can help me? :)

Thanks in advance!

Upvotes: 1

Answers (2)

jay.sf

Reputation: 72813

Your code works fine (apart from the typo) if you (naïvely) assume a normal distribution (i.e. a t-distribution with infinite degrees of freedom).

all.equal(qnorm(1 - .05/2), qt(1 - .05/2, df=Inf))
# [1] TRUE

`colnames<-`(t(apply(d, 1, function(x) 
  x[1] + x[2]*(qt(1 - .05/2, df=Inf)*c(-1, 1)))), paste0(c(2.5, 97.5), "%"))
#                         2.5%      97.5%
# Gender           -1.08798019 1.06798019
# Age              -0.06879892 0.04879892
# Job satisfaction -0.27679712 0.03679712

However, you have n=112 observations, m=3 coefficients and k=1 constant, thus n - m - k degrees of freedom. Hence using the t-distribution with 108 degrees of freedom might be the better choice.

(DOF <- 112 - 3 - 1)
# [1] 108

qt(1 - .05/2, df=DOF)
# [1] 1.982173

`colnames<-`(t(apply(d, 1, function(x) 
  x[1] + x[2]*(qt(1 - .05/2, df=DOF)*c(-1, 1)))), paste0(c(2.5, 97.5), "%"))
#                        2.5%      97.5%
# Gender           -1.1001954 1.08019542
# Age              -0.0694652 0.04946520
# Job satisfaction -0.2785739 0.03857388

For a complete summary you could add t-statistics and p-values

signif(cbind(d, t=d[,1]/d[,2], p=2*pt(-abs(d[,1]/d[,2]), df=DOF),
             `colnames<-`(
               t(apply(d, 1, function(x) 
                 x[1] + x[2]*(qt(1 - .05/2, df=DOF)*c(-1, 1)))), 
               paste0(c(2.5, 97.5), "%"))), 
       2)
#                  B.hat.   se      t    p   2.5% 97.5%
# Gender            -0.01 0.55 -0.018 0.99 -1.100 1.100
# Age               -0.01 0.03 -0.330 0.74 -0.069 0.049
# Job satisfaction  -0.12 0.08 -1.500 0.14 -0.280 0.039

Data:

d <- structure(list(B.hat. = c(-0.01, -0.01, -0.12), se = c(0.55, 
0.03, 0.08)), class = "data.frame", row.names = c("Gender", "Age", 
"Job satisfaction"))

Upvotes: 2

Magnus Nordmo

Reputation: 951

I would use broom::tidy() for this. Like so:

library(tidyverse)
library(broom)
data(mtcars)

head(mtcars)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

mod <- lm(mpg~hp,mtcars)

summary(mod)
#> 
#> Call:
#> lm(formula = mpg ~ hp, data = mtcars)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -5.7121 -2.1122 -0.8854  1.5819  8.2360 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 30.09886    1.63392  18.421  < 2e-16 ***
#> hp          -0.06823    0.01012  -6.742 1.79e-07 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 3.863 on 30 degrees of freedom
#> Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892 
#> F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07

broom::tidy(mod,conf.int = T)
#> # A tibble: 2 x 7
#>   term        estimate std.error statistic  p.value conf.low conf.high
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
#> 1 (Intercept)  30.1       1.63       18.4  6.64e-18  26.8      33.4   
#> 2 hp           -0.0682    0.0101     -6.74 1.79e- 7  -0.0889   -0.0476

^{Created on 2021-01-03 by the reprex package (v0.3.0)}

Welcome to StackOverflow!

Upvotes: 0

Construct 95% confidence interval for regression model

Answers (2)

Related Questions