Reputation: 3
Stupid question, but I want to be sure.
glm(outcome ~ cva, family = "binomial", data=df, x=TRUE, y=TRUE)
Predictors Odds p
(Intercept) 0.16 <0.001
cvaTRUE 1.95 0.029
My 'outcome' variable is a factor (created with as.factor) with values YES or NO.
How can I be sure this glm is giving the Odds of YES and not NO? I.e., I want to be confident that this is saying "if cva is TRUE, then Odds 1.95 for outcome YES".
Upvotes: 0
Views: 708
Reputation: 11306
If outcome is a factor with levels "NO" and "YES" and cva is a logical vector, then
coef(glm(outcome ~ cva, family = binomial, data = df))
shows you (per unit changes in) log odds of "YES" rather than "NO" if and only if "NO" is the first element of levels(outcome). This requirement is documented in ?family:
For the binomial and quasibinomial families the response can be specified in one of three ways:
- As a factor: ‘success’ is interpreted as the factor not having the first level (and hence usually of having the second level).
If you find that outcome is not coded this way, then do
df$outcome <- relevel(df$outcome, "NO")
to replace outcome in df with a semantically equivalent factor whose first level is "NO".
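To see the coding before fitting, it can help to print the levels directly. The following is a minimal sketch on a made-up data frame (the values are hypothetical; only the names df, outcome, and cva mirror the question), where "YES" is deliberately made the first level so that relevel has something to fix:

```r
## Hypothetical toy data, with "YES" deliberately placed first
df <- data.frame(
  outcome = factor(c("YES", "NO", "YES", "NO", "NO", "YES"), levels = c("YES", "NO")),
  cva = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

## The first level is the one glm treats as 'failure'
levels(df$outcome)  # "YES" "NO"

## Keep a copy of the values to confirm that relevel() does not change them
before <- as.character(df$outcome)

## Make "NO" the first (reference) level
df$outcome <- relevel(df$outcome, ref = "NO")
levels(df$outcome)  # "NO" "YES"

## Only the order of the levels changed, not the data
identical(before, as.character(df$outcome))  # TRUE
```

Since relevel only reorders the levels, the observations themselves are untouched; what changes is which level a subsequent glm call treats as the 'success'.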
FWIW, here is one way to check that glm behaves as documented in your use case:
## Simulated data set
set.seed(1L)
n <- 100L
df <- data.frame(
outcome = factor(sample(0:1, size = n, replace = TRUE), levels = 0:1, labels = c("NO", "YES")),
cva = sample(c(FALSE, TRUE), size = n, replace = TRUE)
)
## Contingency table
tt <- table(df$outcome, df$cva)
## Sample odds ratio
r <- (tt["YES", "TRUE"] / tt["NO", "TRUE"]) / (tt["YES", "FALSE"] / tt["NO", "FALSE"])
## Estimated odds ratio when first level is "NO"
m0 <- glm(outcome ~ cva, family = binomial, data = df)
r0 <- exp(coef(m0))[[2L]]
## Reciprocal estimated odds ratio when first level is "YES"
m1 <- glm(relevel(outcome, "YES") ~ cva, family = binomial, data = df)
r1 <- exp(-coef(m1))[[2L]]
print(c(r, r0, r1), digits = 20, width = 30)
[1] 0.85565476190476186247
[2] 0.85565476230266901414
[3] 0.85565476230266912516
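Another check, on the probability scale rather than the odds ratio scale, is to compare the fitted probabilities with the observed share of "YES" in each cva group. This is only a sketch that re-creates the same simulated data so that it runs on its own; the names m, p_hat, and p_yes are introduced here for illustration:

```r
## Same simulated data set as above
set.seed(1L)
n <- 100L
df <- data.frame(
  outcome = factor(sample(0:1, size = n, replace = TRUE), levels = 0:1, labels = c("NO", "YES")),
  cva = sample(c(FALSE, TRUE), size = n, replace = TRUE)
)

m <- glm(outcome ~ cva, family = binomial, data = df)

## Fitted probability of 'success' for cva = FALSE and cva = TRUE
p_hat <- predict(m, newdata = data.frame(cva = c(FALSE, TRUE)), type = "response")

## Observed proportion of "YES" within each cva group (FALSE first, then TRUE)
p_yes <- tapply(df$outcome == "YES", df$cva, mean)

## If 'success' means "YES", the two should agree up to convergence tolerance
all.equal(as.vector(p_hat), as.vector(p_yes), tolerance = 1e-6)
```

If the fitted probabilities matched the share of "NO" instead, that would indicate that "YES" is the first level of outcome.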
Upvotes: 0
Reputation: 6802
With a binary response taking the value of either 0 or 1, the model estimates the odds that outcome is equal to 1. So, if YES is coded as 1, then you can be sure that the odds ratio of 1.95 is for outcome YES.
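To connect this with a factor response: a factor whose first level is "NO" and second level is "YES" should give the same fit as the corresponding 0/1 vector. A minimal sketch on simulated data (the names y01, yfac, b_num, and b_fac are made up for this example):

```r
## Simulated 0/1 response and logical predictor
set.seed(1L)
n <- 100L
y01 <- sample(0:1, size = n, replace = TRUE)
cva <- sample(c(FALSE, TRUE), size = n, replace = TRUE)

## The same response as a factor: 0 -> "NO" (first level), 1 -> "YES"
yfac <- factor(y01, levels = 0:1, labels = c("NO", "YES"))

## Fit once with the numeric response and once with the factor response
b_num <- coef(glm(y01 ~ cva, family = binomial))
b_fac <- coef(glm(yfac ~ cva, family = binomial))

## Both fits model the odds of 1, i.e. of "YES"
all.equal(b_num, b_fac)
```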
Upvotes: 1