Reputation: 3397
Is there a way to make predictions from a fixest model on an observation that has an out-of-sample fixed effects level? I would like this prediction to be based on the weighted mean of the existing fixed effects levels in the training data. For the weights I would like to use the number of observations for each FE level.
Currently, I re-estimate a model without fixed effects and use it for prediction whenever the full model yields a missing value. However, I am looking for a solution that does not require re-estimating or updating the model, similar to using the na.fill argument in a plm model (see this Stack Overflow answer).
In the example below, the Product variable takes on integers from 1 to 20 in the training data, so the prediction for Product 21 yields a missing value:
library(tidyverse)
library(fixest)
# fit model
data(trade)
mod <- feols(log(Euros) ~ log(dist_km) | Product, trade)
# define new data
df <- tribble(
~dist_km, ~Product,
140, 20, # in sample
140, 21 # out of sample
)
# no prediction for FE level 21 that is not in the training data
predict(mod, newdata = df)
#> [1] 20.14376 NA
Created on 2023-03-08 with reprex v2.0.2
Using a model without fixed effects, the value that is currently missing would be replaced by 18.88489.
Upvotes: 2
Views: 754
Reputation: 9310
I do not think that this is possible at the moment with the fixest package. You could do it manually, e.g.
oos <- fixef(mod) |> purrr::map_dbl(function(x){
# plain (unweighted) mean of the estimated fixed effects
sum(x)/length(x)
})
predict(mod, newdata = df) |> tidyr::replace_na(oos)
#> [1] 20.14376 28.84476
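Note that the snippet above takes the plain mean of the fixed effects and substitutes it for the missing prediction as-is, so the log(dist_km) term appears not to be part of the filled value. A minimal sketch of the observation-weighted variant asked for in the question could look like the following. It assumes that every row of trade was used in estimation (so that table(trade$Product) matches the estimation sample) and that the slope coefficient is named log(dist_km); new_df, n_by_level, fe_wmean and pred_filled are illustrative names, not part of fixest.

```r
library(fixest)

data(trade)
mod <- feols(log(Euros) ~ log(dist_km) | Product, trade)

# Same two rows as in the question: one in-sample and one out-of-sample level
new_df <- data.frame(dist_km = c(140, 140), Product = c(20, 21))

# Estimated fixed effects, one per Product level seen in the training data
fe <- fixef(mod)$Product

# Training observations per level, aligned with the names of `fe`
# (assumption: no rows were dropped during estimation)
n_by_level <- table(trade$Product)[names(fe)]

# Mean of the fixed effects, weighted by the number of observations per level
fe_wmean <- weighted.mean(as.numeric(fe), w = as.numeric(n_by_level))

# Linear part of the prediction (assumption: coefficient name "log(dist_km)")
xb <- coef(mod)[["log(dist_km)"]] * log(new_df$dist_km)

# Keep the regular prediction where it exists, fill the rest manually
pred <- predict(mod, newdata = new_df)
pred_filled <- ifelse(is.na(pred), xb + fe_wmean, pred)
pred_filled
```

With several fixed-effects dimensions the same idea would have to be applied per dimension, and only for the dimensions whose level is missing in the training data.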
Upvotes: 0