Reputation: 11
I need to combine 3 dummy variables into one categorical variable for a logistic regression in R. I built the combinations manually, as follows:
new_variable_code | variable_1 | variable_2 | variable_3 |
---|---|---|---|
1 | 0 | 0 | 0 |
2 | 0 | 1 | 0 |
3 | 0 | 1 | 1 |
4 | 1 | 0 | 0 |
5 | 1 | 1 | 0 |
6 | 1 | 1 | 1 |
I excluded the other two combinations, (0 0 1) and (1 0 1), because they do not occur in the data. I then used new_variable_code as a factor in the logistic regression along with other predictors.
My question is: Is there an automated way to create the same new_variable_code? Or is there another econometric technique for encoding the 3 dummy variables as 1 categorical variable inside a logistic regression model?
My objective: To understand which variable combination has the highest odds ratio for the outcome variable (alongside the other predictors in the same model).
Thank you
Upvotes: 1
Views: 601
Reputation: 263301
I would just create a variable with paste using sep="." and make it a factor:
newvar <- factor( paste(variable_1, variable_2, variable_3, sep="."))
I don't think it would be a good idea to then convert it to a sequential value. A factor is already stored as integer codes with level labels, since that's how factors get created.
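To connect this to the goal in the question, here is a minimal sketch of using such a factor in a logistic regression and reading off odds ratios. The outcome `y`, the sample size, and the data frame `dat` are made up for illustration and are not from the question; only the six observed combinations are.

```r
# Hypothetical toy data: the outcome y and the sample size are invented for illustration.
set.seed(1)
combos <- data.frame(
  variable_1 = c(0, 0, 0, 1, 1, 1),
  variable_2 = c(0, 1, 1, 0, 1, 1),
  variable_3 = c(0, 0, 1, 0, 0, 1)
)
dat <- combos[sample(nrow(combos), 300, replace = TRUE), ]
dat$y <- rbinom(nrow(dat), 1, 0.4)

# One factor level per combination that occurs in the data.
# Levels sort alphabetically, so "0.0.0" becomes the reference level.
dat$newvar <- factor(paste(dat$variable_1, dat$variable_2, dat$variable_3, sep = "."))

fit <- glm(y ~ newvar, data = dat, family = binomial)

# Exponentiated coefficients: odds ratio of each combination
# relative to the reference level "0.0.0" (the first entry is the baseline odds).
odds_ratios <- exp(coef(fit))
```

Because the factor is built from the observed values, combinations that never occur (such as 0.0.1) simply get no level. If a different baseline is wanted, `relevel()` can change the reference level before fitting.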
Upvotes: 1
Reputation: 1298
You could use pmap_dbl in the following way to recode your dummy variables to a 1-6 scale:
library(tidyverse)

# Reproducing your data
df1 <- tibble(
  variable_1 = c(0, 0, 0, 1, 1, 1),
  variable_2 = c(0, 1, 1, 0, 1, 1),
  variable_3 = c(0, 0, 1, 0, 0, 1)
)

# The position of each pattern in this vector is its code (1 to 6)
factorlevels <- c("000", "010", "011", "100", "110", "111")

df1 <- df1 %>%
  mutate(
    new_variable_code = pmap_dbl(
      list(variable_1, variable_2, variable_3),
      # Paste the three dummies together and look up the pattern's position
      ~ which(paste0(..1, ..2, ..3) == factorlevels)
    )
  )
Output:
# A tibble: 6 x 4
  variable_1 variable_2 variable_3 new_variable_code
       <dbl>      <dbl>      <dbl>             <dbl>
1          0          0          0                 1
2          0          1          0                 2
3          0          1          1                 3
4          1          0          0                 4
5          1          1          0                 5
6          1          1          1                 6
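As a possible alternative, the same lookup can be sketched in base R without row-wise iteration, since `paste0()` and `match()` are both vectorized. The names `key` and `new_variable_factor` are introduced here for illustration only.

```r
# Same six observed combinations as above, as plain vectors.
variable_1 <- c(0, 0, 0, 1, 1, 1)
variable_2 <- c(0, 1, 1, 0, 1, 1)
variable_3 <- c(0, 0, 1, 0, 0, 1)

factorlevels <- c("000", "010", "011", "100", "110", "111")

# Build one pattern string per row, then look up its position in factorlevels.
key <- paste0(variable_1, variable_2, variable_3)
new_variable_code <- match(key, factorlevels)

# For modelling, a factor with the same level order is usually more convenient
# than the bare integer code.
new_variable_factor <- factor(key, levels = factorlevels)
```

One difference worth noting: `match()` returns `NA` for a pattern that is not listed in `factorlevels`, whereas the `which()` lookup inside `pmap_dbl` would raise an error for such a row.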
Upvotes: 1