User456898
User456898

Reputation: 5724

How to implement Poisson Regression?

There are 2 types of Generalized Linear Models:
1. Log-Linear Regression, also known as Poisson Regression
2. Logistic Regression

How to implement the Poisson Regression in Python for Price Elasticity prediction?

Upvotes: 11

Views: 28881

Answers (2)

Ben
Ben

Reputation: 784

If I am not mistaken, @Altons' answer is for GEEs, which assume some sort of grouped structure. The common Poisson Regression (without a need for a group, such as "subject") is implemented as General Linear Model in statsmodels:

import patsy
import statsmodels as sm
from statsmodels.genmod.families import Poisson


fam = Poisson()
f = 'some_count ~ some_numeric_variable + C(some_categorical_variable)'
y, X = patsy.dmatrices(f, data, return_type='matrix')

p_model = sm.GLM(y, X, family=fam)

result = p_model.fit()
print(result.summary())

The variables used in the formula are just placeholders for variables in the DataFrame data.

Upvotes: 6

Altons
Altons

Reputation: 1424

Have a look at the statmodels package in python.

Here is an example

A bit more of input to avoid the link only answer

Assumming you know python here is an extract of the example I mentioned earlier.

import numpy as np
import pandas as pd
from statsmodels.genmod.generalized_estimating_equations import GEE
from statsmodels.genmod.cov_struct import (Exchangeable,
    Independence,Autoregressive)
from statsmodels.genmod.families import Poisson

pandas will hold the data frame with the data you want to use to feed your poisson model. statsmodels package contains large family of statistical models such as Linear, probit, poisson etc. from here you will import the Poisson family model (hint: see last import)

The way you fit your model is as follow (assuming your dependent variable is called y and your IV are age, trt and base):

fam = Poisson()
ind = Independence()
model1 = GEE.from_formula("y ~ age + trt + base", "subject", data, cov_struct=ind, family=fam)
result1 = model1.fit()
print(result1.summary())

As I am not familiar with the nature of your problem I would suggest to have a look at negative binomial regression if you need to count data is well overdispersed. with High overdispersion your poisson assumptions may not hold.

Plethora of info for poisson regression in R - just google it.

Hope now this answer helps.

Upvotes: 20

Related Questions