Reputation: 21
I have a sample of more than 50 million observations. I estimate the following model in R:
model1 <- feglm(rejection ~ variable1 + variable1^2 + variable2 + variable3 + variable4 | city_fixed_effects + year_fixed_effects, family = binomial(link = "logit"), data = database)
Based on the estimates from model1, I calculate the marginal effects:
mfx2 <- marginaleffects(model1)
summary(mfx2)
This call also computes marginal effects for every fixed effect, which slows R down considerably. I only need the average marginal effects of variables 1, 2, and 3. If I compute them separately with mfx2 <- marginaleffects(model1, variables = "variable1"), the output does not show the standard error or the p-value of the average marginal effect.
Is there a solution to this?
Upvotes: 2
Views: 2467
Reputation: 17805
Both the fixest and the marginaleffects packages have made recent changes to improve interoperability. The next official CRAN releases will be able to do this, but as of 2021-12-08 you can use the development versions. Install:
library(remotes)
install_github("lrberge/fixest")
install_github("vincentarelbundock/marginaleffects")
I recommend converting your fixed effects variables to factors before fitting your models:
library(fixest)
library(marginaleffects)
dat <- mtcars
dat$gear <- as.factor(dat$gear)
mod <- feglm(am ~ mpg + mpg^2 + hp + hp^3 | gear,
             family = binomial(link = "logit"),
             data = dat)
Then, you can use marginaleffects and summary to compute average marginal effects:
mfx <- marginaleffects(mod, variables = "mpg")
summary(mfx)
## Average marginal effects
##       type Term Effect Std. Error  z value Pr(>|z|)  2.5 % 97.5 %
## 1 response  mpg 0.3352         40 0.008381  0.99331 -78.06  78.73
##
## Model type: fixest
## Prediction type: response
Note that computing average marginal effects requires calculating a distinct marginal effect for every single row of your dataset. This can be computationally expensive when your data includes millions of observations.
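To make that cost concrete, here is a minimal sketch of what an average marginal effect actually averages. It uses base R's glm on mtcars rather than feglm, and a logit with linear terms only. Both are assumptions made for illustration, so that the per-row derivative has the simple closed form beta * p * (1 - p). It is not meant to reproduce what marginaleffects does internally.

```r
# Illustration only: plain logit via stats::glm, not fixest::feglm.
fit <- glm(am ~ hp + wt, family = binomial(link = "logit"), data = mtcars)
beta_hp <- unname(coef(fit)["hp"])

# Fitted probability for every row of the data.
p <- predict(fit, type = "response")

# For a logit with a linear term in hp:
#   d Pr(am = 1) / d hp = beta_hp * p * (1 - p)
# so there is one marginal effect per observation.
me_rows <- beta_hp * p * (1 - p)

# The average marginal effect is the mean of those row-level effects.
ame_hp <- mean(me_rows)

# With very large data, one possible shortcut is to average over a random
# subsample of rows. This adds sampling noise; the size 16 is arbitrary.
set.seed(1)
idx <- sample(nrow(mtcars), size = 16)
p_sub <- predict(fit, newdata = mtcars[idx, ], type = "response")
ame_hp_sub <- mean(beta_hp * p_sub * (1 - p_sub))
```

The work grows with the number of rows because me_rows has one entry per observation. The subsample version trades some precision for speed, and the same idea could presumably be applied in marginaleffects by passing a subset of the data through the newdata argument.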
Instead, you can compute marginal effects for specific values of the regressors using the newdata argument and the typical function. Please refer to the marginaleffects documentation for details on those:
marginaleffects(mod,
                variables = "mpg",
                newdata = typical(mpg = 22, gear = 4))
##   rowid     type term     dydx std.error       hp mpg gear predicted
## 1     1 response  mpg 1.068844   50.7849 146.6875  22    4 0.4167502
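For intuition about why this is cheap, the effect at a single representative point can be sketched by hand. The example again swaps in base R's glm and a linear-terms-only logit (both assumptions for illustration), and evaluates the derivative at hp = 150 and wt = 3, values chosen arbitrarily.

```r
# Illustration only: plain logit via stats::glm, not fixest::feglm.
fit <- glm(am ~ hp + wt, family = binomial(link = "logit"), data = mtcars)

# Hypothetical representative values of the regressors.
point <- data.frame(hp = 150, wt = 3)

# One predicted probability at that point.
p0 <- unname(predict(fit, newdata = point, type = "response"))

# Marginal effect of hp at that point: beta_hp * p0 * (1 - p0).
me_at_point <- unname(coef(fit)["hp"]) * p0 * (1 - p0)
```

Only one prediction is needed here, so the cost does not depend on how many rows the original data has.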
Upvotes: 4