ensure same N across nested multinomial models

Question

I'm running a set of nested models and need to ensure all models have the same N and use the same observations.

In Stata, I could run the model with the smallest N, generate a flag of the observations used in the model using e(sample) command, and then use that as a filter when I run the regression. Does anyone know who to do this in R? Using tidy() or augment() doesn't seem to work for multinomial models using the multinom package.

Thanks for the help. Sample code using palmerpenguins package:

install.packages("palmerpenguins")
library(palmerpenguins)

head(penguins)
View(penguins)

# models 
multinom_islands <- list(
  "model 1"  = nnet::multinom(island ~ bill_length_mm + bill_depth_mm,
                              data = penguins, trace = FALSE, na.action = "na.exclude"),
  "model 2"  = nnet::multinom(island ~ bill_length_mm + bill_depth_mm + sex + year, 
                              data = penguins, trace = FALSE, na.action = "na.exclude"))


# tables 
modelsummary(multinom_islands, group = y.level + term ~ model, 
             exponentiate = TRUE, 
             stars = c("*"=.05, "**"=.01, "***"=0.001), 
             title = "Nested multinomial regression models predicting island",
             output = "markdown")

Vincent · Accepted Answer

The solution you describe is Stata would not be guaranteed to work unless the variables in the “small” model are a subset of the variables in the “large” model.

One way to be sure you get the expected results is to begin by creating a new dataset with all the variables from all models. Then, we use the na.omit function to eliminate all missing data. Finally, we estimate the models. This does not guarantee the largest possible sample size, but it guarantees that all models will have the same number of observations:

library(modelsummary)
library(palmerpenguins)
library(nnet)

variables <- c("island", "bill_length_mm", "bill_depth_mm", "sex", "year")
dat <- penguins[, variables]
dat <- na.omit(dat)

mod <- list(
  multinom(island ~ bill_length_mm + bill_depth_mm,
           data = dat, trace = FALSE),
  multinom(island ~ bill_length_mm + bill_depth_mm + sex + year,
           data = dat, trace = FALSE))

modelsummary(mod, group = y.level + term ~ model, output = "markdown")

		Model 1	Model 2
Dream	(Intercept)	-16.087	0.008
		(2.340)	(0.000)
	bill_length_mm	0.008	0.096
		(0.027)	(0.032)
	bill_depth_mm	0.899	1.219
		(0.104)	(0.116)
	sexmale		-2.202
			(0.286)
	year		-0.012
			(0.001)
Torgersen	(Intercept)	-9.346	-0.006
		(3.297)	(0.000)
	bill_length_mm	-0.209	-0.149
		(0.047)	(0.049)
	bill_depth_mm	0.966	1.133
		(0.157)	(0.155)
	sexmale		-1.302
			(0.171)
	year		-0.007
			(0.002)
	Num.Obs.	333	333
	AIC	482.9	458.4
	edf	6.000	10.000

ensure same N across nested multinomial models

Answers (1)

Related Questions