Reputation: 5724
Two common types of Generalized Linear Models are:
1. Log-Linear Regression, also known as Poisson Regression
2. Logistic Regression
How can I implement Poisson regression in Python for price elasticity prediction?
Upvotes: 11
Views: 28881
Reputation: 784
If I am not mistaken, @Altons' answer is for GEEs, which assume some sort of grouped structure. The common Poisson regression (without the need for a group, such as "subject") is implemented as a Generalized Linear Model (GLM) in statsmodels:
import patsy
# GLM is exposed through statsmodels.api, not the bare statsmodels package
import statsmodels.api as sm
from statsmodels.genmod.families import Poisson
fam = Poisson()
f = 'some_count ~ some_numeric_variable + C(some_categorical_variable)'
y, X = patsy.dmatrices(f, data, return_type='matrix')
p_model = sm.GLM(y, X, family=fam)
result = p_model.fit()
print(result.summary())
The variables used in the formula are just placeholders for variables in the DataFrame data.
Upvotes: 6
Reputation: 1424
Have a look at the statsmodels package in Python.
Here is an example
Here is a bit more detail, to avoid a link-only answer.
Assuming you know Python, here is an extract of the example I mentioned earlier.
import numpy as np
import pandas as pd
from statsmodels.genmod.generalized_estimating_equations import GEE
from statsmodels.genmod.cov_struct import (Exchangeable,
    Independence, Autoregressive)
from statsmodels.genmod.families import Poisson
pandas will hold the data frame with the data you want to feed to your Poisson model.
The statsmodels package contains a large family of statistical models such as linear, probit, Poisson, etc. From here you will import the Poisson family (hint: see the last import).
The way you fit your model is as follows (assuming your dependent variable is called y and your independent variables are age, trt and base):
fam = Poisson()
ind = Independence()
model1 = GEE.from_formula("y ~ age + trt + base", "subject", data, cov_struct=ind, family=fam)
result1 = model1.fit()
print(result1.summary())
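The snippet above assumes a data frame called data already exists. A runnable sketch under stated assumptions could look like the following: the data are simulated, with a hypothetical 200 subjects measured 4 times each, and the coefficients used to generate y are made up. Only the age and trt covariates are simulated here, and an Exchangeable working correlation is swapped in for Independence to show the other imported structure.

```python
import numpy as np
import pandas as pd
from statsmodels.genmod.generalized_estimating_equations import GEE
from statsmodels.genmod.cov_struct import Exchangeable
from statsmodels.genmod.families import Poisson

# Simulated repeated-measures data (assumption): 200 subjects, 4 visits each
rng = np.random.default_rng(2)
n_subjects, n_visits = 200, 4
subject = np.repeat(np.arange(n_subjects), n_visits)
age = np.repeat(rng.uniform(20, 60, n_subjects), n_visits)
trt = np.repeat(rng.integers(0, 2, n_subjects), n_visits)

# A per-subject random effect makes observations within a subject correlated
subject_effect = np.repeat(rng.normal(0.0, 0.3, n_subjects), n_visits)
mu = np.exp(0.5 + 0.01 * age - 0.4 * trt + subject_effect)

data = pd.DataFrame({
    "subject": subject,
    "age": age,
    "trt": trt,
    "y": rng.poisson(mu),
})

# "subject" names the grouping column; observations are correlated within it
model = GEE.from_formula("y ~ age + trt", "subject", data,
                         cov_struct=Exchangeable(), family=Poisson())
result = model.fit()
print(result.summary())
```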
As I am not familiar with the nature of your problem, I would suggest having a look at negative binomial regression if your count data are overdispersed. With high overdispersion, the Poisson assumption that the variance equals the mean may not hold.
There is a plethora of information on Poisson regression in R; just google it.
I hope this answer helps now.
Upvotes: 20