Reputation: 11
I need some help here. I've got the following table explaining workplace turnover:
..............B(hat) se
Gender -0.01 0.55
Age -0.01 0.03
Job satisfaction -0.12 0.08
So, workplace turnover is coded as 1 for those who change jobs, and job satisfaction is measured on an interval scale coded from 0 = very unsatisfied to 10 = very satisfied.
I am supposed to make a 95% confidence interval for job satisfaction. However, I am not quite sure what to do. I have made this so far:
c(-0.12 - qnorm(0.975) * 0.08, -0.012 + qnorm(0.975) * 0.08)
From this I get the following result: [1] -0.2767971 0.1447971
I am not quite sure whether this is the correct way.
Does anyone have any input that can help me? :)
Thanks in advance!
Upvotes: 1
Views: 533
Reputation: 72813
Your code works fine (apart from the typo) if you (naïvely) assume a normal distribution (i.e. a t-distribution with infinite degrees of freedom).
all.equal(qnorm(1 - .05/2), qt(1 - .05/2, df=Inf))
# [1] TRUE
`colnames<-`(t(apply(d, 1, function(x)
x[1] + x[2]*(qt(1 - .05/2, df=Inf)*c(-1, 1)))), paste0(c(2.5, 97.5), "%"))
# 2.5% 97.5%
# Gender -1.08798019 1.06798019
# Age -0.06879892 0.04879892
# Job satisfaction -0.27679712 0.03679712
However, you have n=112 observations, m=3 coefficients and k=1 constant, thus n - m - k degrees of freedom. Hence using the t-distribution with 108 degrees of freedom might be the better choice.
(DOF <- 112 - 3 - 1)
# [1] 108
qt(1 - .05/2, df=DOF)
# [1] 1.982173
`colnames<-`(t(apply(d, 1, function(x)
x[1] + x[2]*(qt(1 - .05/2, df=DOF)*c(-1, 1)))), paste0(c(2.5, 97.5), "%"))
# 2.5% 97.5%
# Gender -1.1001954 1.08019542
# Age -0.0694652 0.04946520
# Job satisfaction -0.2785739 0.03857388
For a complete summary you could add t-statistics and p-values
signif(cbind(d, t=d[,1]/d[,2], p=2*pt(-abs(d[,1]/d[,2]), df=DOF),
`colnames<-`(
t(apply(d, 1, function(x)
x[1] + x[2]*(qt(1 - .05/2, df=DOF)*c(-1, 1)))),
paste0(c(2.5, 97.5), "%"))),
2)
# B.hat. se t p 2.5% 97.5%
# Gender -0.01 0.55 -0.018 0.99 -1.100 1.100
# Age -0.01 0.03 -0.330 0.74 -0.069 0.049
# Job satisfaction -0.12 0.08 -1.500 0.14 -0.280 0.039
Data:
d <- structure(list(B.hat. = c(-0.01, -0.01, -0.12), se = c(0.55,
0.03, 0.08)), class = "data.frame", row.names = c("Gender", "Age",
"Job satisfaction"))
Upvotes: 2
Reputation: 951
I would use broom::tidy()
for this. Like so:
library(tidyverse)
library(broom)
data(mtcars)
head(mtcars)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
mod <- lm(mpg~hp,mtcars)
summary(mod)
#>
#> Call:
#> lm(formula = mpg ~ hp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -5.7121 -2.1122 -0.8854 1.5819 8.2360
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 30.09886 1.63392 18.421 < 2e-16 ***
#> hp -0.06823 0.01012 -6.742 1.79e-07 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.863 on 30 degrees of freedom
#> Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
#> F-statistic: 45.46 on 1 and 30 DF, p-value: 1.788e-07
broom::tidy(mod,conf.int = T)
#> # A tibble: 2 x 7
#> term estimate std.error statistic p.value conf.low conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 30.1 1.63 18.4 6.64e-18 26.8 33.4
#> 2 hp -0.0682 0.0101 -6.74 1.79e- 7 -0.0889 -0.0476
Created on 2021-01-03 by the reprex package (v0.3.0)
Welcome to StackOverflow!
Upvotes: 0