Reputation: 169
I'm running a set of nested models and need to ensure all models have the same N and use the same observations.
In Stata, I could run the model with the smallest N, generate a flag of the observations used in the model using e(sample) command, and then use that as a filter when I run the regression. Does anyone know who to do this in R? Using tidy()
or augment()
doesn't seem to work for multinomial models using the multinom
package.
Thanks for the help. Sample code using palmerpenguins
package:
install.packages("palmerpenguins")
library(palmerpenguins)
head(penguins)
View(penguins)
# models
multinom_islands <- list(
"model 1" = nnet::multinom(island ~ bill_length_mm + bill_depth_mm,
data = penguins, trace = FALSE, na.action = "na.exclude"),
"model 2" = nnet::multinom(island ~ bill_length_mm + bill_depth_mm + sex + year,
data = penguins, trace = FALSE, na.action = "na.exclude"))
# tables
modelsummary(multinom_islands, group = y.level + term ~ model,
exponentiate = TRUE,
stars = c("*"=.05, "**"=.01, "***"=0.001),
title = "Nested multinomial regression models predicting island",
output = "markdown")
Upvotes: 0
Views: 75
Reputation: 17725
The solution you describe is Stata would not be guaranteed to work unless the variables in the “small” model are a subset of the variables in the “large” model.
One way to be sure you get the expected results is to begin by creating a new dataset with all the variables from all models. Then, we use the na.omit
function to eliminate all missing data. Finally, we estimate the models. This does not guarantee the largest possible sample size, but it guarantees that all models will have the same number of observations:
library(modelsummary)
library(palmerpenguins)
library(nnet)
variables <- c("island", "bill_length_mm", "bill_depth_mm", "sex", "year")
dat <- penguins[, variables]
dat <- na.omit(dat)
mod <- list(
multinom(island ~ bill_length_mm + bill_depth_mm,
data = dat, trace = FALSE),
multinom(island ~ bill_length_mm + bill_depth_mm + sex + year,
data = dat, trace = FALSE))
modelsummary(mod, group = y.level + term ~ model, output = "markdown")
Model 1 | Model 2 | ||
---|---|---|---|
Dream | (Intercept) | -16.087 | 0.008 |
(2.340) | (0.000) | ||
bill_length_mm | 0.008 | 0.096 | |
(0.027) | (0.032) | ||
bill_depth_mm | 0.899 | 1.219 | |
(0.104) | (0.116) | ||
sexmale | -2.202 | ||
(0.286) | |||
year | -0.012 | ||
(0.001) | |||
Torgersen | (Intercept) | -9.346 | -0.006 |
(3.297) | (0.000) | ||
bill_length_mm | -0.209 | -0.149 | |
(0.047) | (0.049) | ||
bill_depth_mm | 0.966 | 1.133 | |
(0.157) | (0.155) | ||
sexmale | -1.302 | ||
(0.171) | |||
year | -0.007 | ||
(0.002) | |||
Num.Obs. | 333 | 333 | |
AIC | 482.9 | 458.4 | |
edf | 6.000 | 10.000 |
Upvotes: 1