Predicting claim number through GLM model

Question

I'm working on a case study in which I have to predict the number of claims per policy. Since my target variable ClaimNb is a count rather than a binary outcome, I can't use logistic regression, so I'm using Poisson regression instead. My code for the GLM:


    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    formula = 'ClaimNb ~ BonusMalus+VehAge+Freq+VehGas+Exposure+VehPower+Density+DrivAge'

    model = smf.glm(formula=formula, data=df,
                    family=sm.families.Poisson())

I have also split my data


    from sklearn.model_selection import train_test_split

    # train-test split
    train, test = train_test_split(data, test_size=0.2, random_state=0)

    # separate the target and the independent variables
    train_x = train.drop(columns=['ClaimNb'])
    train_y = train['ClaimNb']

    test_x = test.drop(columns=['ClaimNb'])
    test_y = test['ClaimNb']

My problem now is the prediction. I tried the following, but it did not work:

    from sklearn.linear_model import PoissonRegressor

    model = PoissonRegressor(alpha=1e-3, max_iter=1000)

    model.fit(train_x, train_y)

    predict = model.predict(test_x)
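
The question does not say what the failure was. One possible cause, assuming `VehGas` holds strings such as "Diesel" or "Regular", is that `PoissonRegressor` needs purely numeric input. A rough sketch of encoding the categorical column inside a pipeline, on synthetic data with made-up values and only a subset of the columns:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the real data (hypothetical values).
rng = np.random.default_rng(0)
n = 1000
data = pd.DataFrame({
    "BonusMalus": rng.integers(50, 120, n),
    "DrivAge": rng.integers(18, 80, n),
    "Exposure": rng.uniform(0.1, 1.0, n),
    "VehGas": rng.choice(["Diesel", "Regular"], n),  # assumed to be strings
})
data["ClaimNb"] = rng.poisson(0.3 * data["Exposure"])

train, test = train_test_split(data, test_size=0.2, random_state=0)
train_x, train_y = train.drop(columns=["ClaimNb"]), train["ClaimNb"]
test_x, test_y = test.drop(columns=["ClaimNb"]), test["ClaimNb"]

# One-hot encode the string column and scale the numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["VehGas"]),
    ("num", StandardScaler(), ["BonusMalus", "DrivAge", "Exposure"]),
])
pipeline = make_pipeline(preprocess, PoissonRegressor(alpha=1e-3, max_iter=1000))

pipeline.fit(train_x, train_y)
predicted = pipeline.predict(test_x)  # expected claim counts, always positive
print(predicted[:5])
```

If the real error was something else (for example missing values), this sketch would not address it.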

Is there another way to predict the claim counts and to check the accuracy of the model?

thanks
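
On checking accuracy: classification-style accuracy does not really apply to counts, so a sketch like the following compares predictions with metrics suited to count data. The helper `evaluate_counts` and the toy numbers are hypothetical; `mean_poisson_deviance` and `mean_absolute_error` are scikit-learn functions. The baseline here predicts the mean of the evaluated targets, which is a simplification (the training mean would be the stricter choice).

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_poisson_deviance


def evaluate_counts(y_true, y_pred):
    """Return error metrics for predicted counts (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Constant baseline: predict the average count for every policy.
    baseline = np.full_like(y_true, y_true.mean())
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        # Poisson deviance requires strictly positive predictions.
        "poisson_deviance": mean_poisson_deviance(y_true, y_pred),
        "baseline_deviance": mean_poisson_deviance(y_true, baseline),
    }


# Toy observed counts and predicted means, for illustration only.
y_true = [0, 0, 1, 0, 2, 0]
y_pred = [0.1, 0.2, 0.8, 0.1, 1.5, 0.2]
scores = evaluate_counts(y_true, y_pred)
print(scores)
```

A model whose Poisson deviance is lower than the baseline deviance is doing better than always predicting the average claim count.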
