Reputation: 556
I'm trying to do a 'classic' difference in difference with multiple time periods. The model I want to do is:
y = a + b1x1 + b2_treat + b3_period + b_4(treat*period) + u (eq.1)
So basically I'm testing different setups just to make sure I specify my model in the right way, using different packages. I want to use the fixest-package, so I tried to compare the estimates with the estimates from the standard lm()-package. The results, however, differ -- both coefficients and std.errors.
My question is:
If not, I would appreciate it if anyone can show me how to get the same results in lm() as in feols()!
# libraries
library(fixest)
library(modelsummary)
library(tidyverse)
# load data
data(base_did)
# make df for lm_mod with 5 as the reference-period
base_ref_5 <- base_did %>%
mutate(period = as.factor(period)) %>%
mutate(period = relevel(period, ref = 5))
# Notice that i use base_ref_5 for the lm model and base_did for the feol_mod.
lm_mod <- lm(y ~ x1 + treat*period, base_ref_5)
lm_mod2 <- lm(y ~ x1 + treat + period + treat*period, base_ref_5)
feols_mod <- feols(y ~ x1 + i(period, treat, ref = 5), base_did)
# compare models
models <- list("lm" = lm_mod,
"lm2" = lm_mod2,
"feols" = feols_mod)
msummary(models, stars = T)
**EDIT:**
the reason why I created base_ref_5 was so that both regressions would have period 5 as the reference period, if that was unclear.
**EDIT 2**:
added a third model (lm_mod2) which is much closer, but there is still a difference.
Upvotes: 1
Views: 1039
Reputation: 17868
There are two issues, here.
lm()
model, the period
variable is interacted, but treated as a continuous numeric variable. In contrast, calling i(period, treat)
treats period
as a factor (this is explained clearly in the documentation).i()
function only includes the interactions, and not the constitutive terms.Here are two models to illustrate the parallels:
library(fixest)
data(base_did)
lm_mod <- lm(y ~ x1 + factor(period) * factor(treat), base_did)
feols_mod <- feols(y ~ x1 + factor(period) + i(period, treat), base_did)
coef(lm_mod)["x1"]
#> x1
#> 0.9799697
coef(feols_mod)["x1"]
#> x1
#> 0.9799697
Please note that I only answered the part of your question about parallels between lm
and feols
. StackOverflow is a programming Q&A site. If you have questions about the proper specification of a statistical model, you might want to ask on CrossValidated.
Upvotes: 2