Kob

Reputation: 167

Different result using fixest and multiple fixed effects

First of all, I apologize if my title is misleading. I was not sure how to phrase my question.

I am currently working on a fixed-effects model. My data look like the following table; the values shown are not the actual data, for privacy reasons.

state district year grade Y X id
AK 1001 2009 3 0.1 0.5 1001.3
AK 1001 2010 3 0.8 0.4 1001.3
AK 1001 2011 3 0.5 0.7 1001.3
AK 1001 2009 4 1.5 1.3 1001.4
AK 1001 2010 4 1.1 0.7 1001.4
AK 1001 2011 4 2.1 0.4 1001.4
... ... ... .. .. .. ...
WY 5606 2011 6 4.2 5.3 5606.6

I used the fixest package to estimate the fixed-effects model for this project. An observation in this dataset is uniquely identified by the combination of district, grade, and year. I avoided plm because, as far as I can tell, it cannot take three fixed effects unless two identifiers are combined into one (in my case, I generated id by combining district and grade). fixest seems to solve this problem. However, I get different results when specifying three fixed effects (district, grade, and year) than when specifying two (id and year). The code and results below should make this clearer.

# Two fixed effects (id and year)
# Build a district-grade identifier; paste() is vectorised, so no row-wise apply() is needed
df <- transform(df, id = paste(district, grade, sep = "."))
fe = feols(y ~ x | id + year, df, se = "standard")
summary(fe)

OLS estimation, Dep. Var.: y
Observations: 499,112 
Fixed-effects: id: 64,302,  year: 10
Standard-errors: IID 
  Estimate Std. Error t value   Pr(>|t|)    
X 0.012672   0.003602 3.51804 0.00043478 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 0.589222     Adj. R2: 0.761891
                 Within R2: 2.846e-5

###########################################################################

# Three fixed effects (district, grade, and year)
fe = feols(y ~ x | district + grade + year, df, se = "standard")
summary(fe)

OLS estimation, Dep. Var.: y
Observations: 499,112 
Fixed-effects: district: 11,097,  grade: 6,  year: 10
Standard-errors: IID 
  Estimate Std. Error t value   Pr(>|t|)    
X 0.014593    0.00401 3.63866 0.00027408 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 0.702543     Adj. R2: 0.698399
                 Within R2: 2.713e-5
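
As a side check on the setup described above, the sketch below uses a small made-up data frame (not the real data) to confirm that district, grade, and year jointly identify a row, and to count how many intercepts each specification estimates. The toy values and the variable names `n_dup`, `n_dup_id`, `n_id`, `n_district`, and `n_grade` are illustrative assumptions.

```r
# Toy data mimicking the layout of the table above (values are made up).
df <- expand.grid(year = 2009:2011, grade = 3:4, district = c(1001, 1002))
df$id <- paste(df$district, df$grade, sep = ".")

# anyDuplicated() returns 0 when every (district, grade, year) triple is unique.
n_dup <- anyDuplicated(df[c("district", "grade", "year")])

# The same holds for (id, year), because id encodes district and grade.
n_dup_id <- anyDuplicated(df[c("id", "year")])

# Number of distinct levels behind each fixed effect.
n_id       <- length(unique(df$id))        # one intercept per district-grade cell
n_district <- length(unique(df$district))  # one intercept per district
n_grade    <- length(unique(df$grade))     # one intercept per grade

print(c(id = n_id, district = n_district, grade = n_grade))
```

In the real data the same counts appear in the summaries above: 64,302 id levels in the first model, against 11,097 district levels plus 6 grade levels in the second. The two specifications therefore estimate very different numbers of intercepts.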

Questions:

  1. Why are the results different?
  2. This is the equation I plan to use: [equation image]. I am not sure which model corresponds to this specification. My feeling is that it is the second one. But if that is the case, why do many websites recommend combining two identifiers and running a standard plm?

Thank you so much for reading my question. Any answers, suggestions, or advice would be appreciated!

Upvotes: 0

Views: 1133

Answers (1)

Laurent Bergé

Reputation: 1372

The answer is simply that you are estimating two different models.

Three fixed-effects (FEs):

[equation image]

Year + id FEs (I renamed id to district_grade):

[equation image]
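
The two equations were posted as images. A plausible way to write the two specifications, based only on the model descriptions in this thread, is sketched below. The symbols are assumptions chosen here, not the original notation: $d$ indexes districts, $g$ grades, $t$ years, and $\beta$ is the coefficient on $x$.

```latex
% Three fixed effects: district, grade, and year enter additively
y_{dgt} = \alpha_d + \gamma_g + \delta_t + \beta x_{dgt} + \varepsilon_{dgt}

% Year + district_grade fixed effects: one intercept per district-grade cell
y_{dgt} = \alpha_{dg} + \delta_t + \beta x_{dgt} + \varepsilon_{dgt}
```

Setting $\alpha_{dg} = \alpha_d + \gamma_g$ in the second equation gives back the first, which is the sense in which the two models are nested.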

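The first set of fixed effects is strictly nested in the second: the district_grade FEs absorb everything the district and grade FEs absorb, plus any district-by-grade interaction. The second specification is therefore the more demanding one, since it leaves less variation to identify the coefficient on x.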
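
The nesting can be illustrated on simulated data with base R alone. The sketch below is not the asker's data: all names and values are assumptions, and `lm()` with factor dummies stands in for `feols()` so that it runs without extra packages. In fixest the two models would correspond to `y ~ x | district + grade + year` and `y ~ x | district^grade + year`.

```r
set.seed(1)

# Simulated panel: 30 districts x 4 grades x 5 years (illustrative sizes).
d <- expand.grid(district = 1:30, grade = 1:4, year = 1:5)
d$cell <- interaction(d$district, d$grade, drop = TRUE)  # district-grade cell

# Cell-specific intercepts that are NOT the sum of a district and a grade effect.
cell_eff <- rnorm(nlevels(d$cell))[as.integer(d$cell)]
d$x <- 0.5 * cell_eff + rnorm(nrow(d))   # x is correlated with the cell effect
d$y <- d$x + cell_eff + rnorm(nrow(d))   # true coefficient on x is 1

# Model 1: additive district + grade + year fixed effects.
m_add <- lm(y ~ x + factor(district) + factor(grade) + factor(year), data = d)

# Model 2: one fixed effect per district-grade cell, plus year.
m_int <- lm(y ~ x + cell + factor(year), data = d)

b_add <- coef(m_add)[["x"]]
b_int <- coef(m_int)[["x"]]

# Model 1 is nested in model 2, so model 2 cannot have a larger residual sum of squares.
rss_add <- sum(resid(m_add)^2)
rss_int <- sum(resid(m_int)^2)

# F-test of the additive restriction against the cell-level fixed effects.
ftest <- anova(m_add, m_int)

print(c(additive = b_add, interacted = b_int))
print(ftest)
```

When the additive restriction is rejected, as it is by construction here, the two coefficients on x need not agree. The cell-level specification is the one that removes all time-invariant district-grade heterogeneity.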

Here is a reproducible example in which we see that we obtain two different sets of coefficients.

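library(fixest)
data(trade)
# sw() estimates one model per fixed-effects specification: Origin + Product, then Origin^Product
est = fepois(Euros ~ log(dist_km) | sw(Origin + Product, Origin^Product) + Year, trade)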

etable(est, vcov = "iid")
#>                             model 1             model 2
#> Dependent Var.:               Euros               Euros
#>                                                        
#> log(dist_km)    -1.020*** (1.18e-6) -1.024*** (1.19e-6)
#> Fixed-Effects:  ------------------- -------------------
#> Origin                          Yes                  No
#> Product                         Yes                  No
#> Year                            Yes                 Yes
#> Origin-Product                   No                 Yes
#> _______________ ___________________ ___________________
#> S.E. type                       IID                 IID
#> Observations                 38,325              38,325
#> Squared Cor.                0.27817             0.35902
#> Pseudo R2                   0.53802             0.64562
#> BIC                        2.75e+12            2.11e+12
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can see that the estimated fixed effects differ, which confirms that the two estimated models are not the same:

summary(fixef(est$`Origin + Product + Year`))
#> Fixed_effects coefficients
#>                         Origin Product  Year
#> Number of fixed-effects     15      20    10
#> Number of references         0       1     1
#> Mean                      23.5  -0.012 0.157
#> Standard-deviation        1.15    1.35 0.113
#> COEFFICIENTS:
#>   Origin:    AT    BE    DE    DK    ES                 
#>           22.91 23.84 24.62 23.62 24.83 ... 10 remaining
#> -----
#>   Product: 1     2     3     4      5                 
#>            0 1.381 0.624 1.414 -1.527 ... 15 remaining
#> -----
#>   Year: 2007    2008     2009    2010  2011                
#>            0 0.06986 0.006301 0.07463 0.163 ... 5 remaining

summary(fixef(est$`Origin^Product + Year`))
#> Fixed_effects coefficients
#>                         Origin^Product  Year
#> Number of fixed-effects            300    10
#> Number of references                 0     1
#> Mean                              23.1 0.157
#> Standard-deviation                1.96 0.113
#> COEFFICIENTS:
#>   Origin^Product:   101   102   103   104   105                  
#>                   22.32 24.42 24.82 21.28 23.04 ... 295 remaining
#> -----
#>   Year: 2007    2008     2009    2010   2011                
#>            0 0.06962 0.006204 0.07454 0.1633 ... 5 remaining

Upvotes: 1
