user3378649

Reputation: 5354

Problems using GLM

I am having difficulty understanding how to use a GLM with a Poisson family.

import numpy as np
import pandas as pd  # needed for pd.DataFrame below
import statsmodels.api as sm  # formerly scikits.statsmodels

dataset = pd.DataFrame({'A':np.random.rand(100)*1000, 
                        'B':np.random.rand(100)*100,  
                        'C':np.random.rand(100)*10, 
                        'target':np.random.rand(100)})

# .ix has been removed from pandas; .loc does the same label-based selection
X = dataset.loc[:,['A','B','C']].values
y = dataset.loc[:,['target']].values
size = 1e5
nbeta = 3

fam = sm.families.Poisson()
glm = sm.GLM(y,X, family=fam)
res = glm.fit()

Upvotes: 0

Views: 2369

Answers (1)

jseabold

Reputation: 8283

Sourceforge is down right now. When it's back up, you should read through the documentation and examples. There are plenty of usage notes for prediction and GLM.

How to label your target is up to you and is probably a question for Cross Validated. Poisson is intended for counts. It can also be used on continuous data, but you should know what you're doing.

If you have a 0/1 target, then you want a Logit or Probit model. For a count target, something like the following works. You don't need to convert the pandas objects to numpy.

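import numpy as np
import pandas as pd
import statsmodels.api as sm

# Random predictors and an integer (count) target in 0..4
dataset = pd.DataFrame({'A':np.random.rand(100)*1000, 
                        'B':np.random.rand(100)*100,  
                        'C':np.random.rand(100)*10, 
                        'target':np.random.randint(0, 5, 100)})

# Copy so that adding a column does not write into a slice of dataset
X = dataset[['A','B','C']].copy()
X['constant'] = 1  # GLM does not add an intercept automatically
y = dataset['target']

fam = sm.families.Poisson()
glm = sm.GLM(y,X, family=fam)
res = glm.fit()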

predict = res.predict()

Or you could directly use the maximum likelihood estimator for Poisson.

res = sm.Poisson(y, X).fit()
predict = res.predict()

Upvotes: 2
