Reputation: 21
I'm trying to convert the following code from R to Python using the Statsmodels module:
model <- glm(goals ~ att + def + home - (1), data=df, family=poisson, weights=weight)
I've got a similar dataframe (named df) using pandas, and currently have the following line in Python (version 3.4 if it makes a difference):
model = sm.Poisson.from_formula("goals ~ att + def + home - 1", df).fit()
Or, using GLM:
smf.glm("goals ~ att + def + home - 1", df, family=sm.families.Poisson()).fit()
However, I can't get the weighting terms to work. Each record in the dataframe has a date, and I want more recent records to count for more when fitting the model than older ones. I've not seen an example of it being used, but surely if it can be done in R, it can be done in Statsmodels... right?
Upvotes: 2
Views: 2461
Reputation: 1
There are two solutions for setting up weights for Poisson regression. The first is to use freq_weights in the GLM function, as mentioned by MarkWPiper. The second is to just go with Poisson regression and pass the weights to exposure. As documented here: "Log(exposure) is added to the linear prediction with coefficient equal to 1." This does the same mathematical trick as mentioned by Yaron, although the parameter has a different original meaning. Note that an offset scales the expected count by the weight rather than weighting each record's log-likelihood contribution, so the estimates need not match what R's weights argument gives. Sample code is as follows:
import statsmodels.api as sm
# or: from statsmodels.discrete.discrete_model import Poisson
fitted = sm.Poisson.from_formula("goals ~ att + def + home - 1", data=df, exposure=df['weight']).fit()
Upvotes: 0
Reputation: 923
freq_weights is now supported on GLM Poisson, but unfortunately not on sm.Poisson. To use it, pass freq_weights when creating the GLM:
import statsmodels.api as sm
import statsmodels.formula.api as smf
formula = "goals ~ att + def + home - 1"
smf.glm(formula, df, family=sm.families.Poisson(), freq_weights=df['freq_weight']).fit()
Upvotes: 2
Reputation: 1852
I've encountered the same issue. There is a workaround that should lead to the same results: add the weight on a logarithmic scale (np.log(weight)) as one of the explanatory variables, with its beta fixed at 1 (the offset option). I can see there is also an exposure option, which does the same thing as I explained above.
Upvotes: 0