cordoba27

Reputation: 11

Using categorical variable in spreg models

I would like to employ a spatial regression model, using the spreg package in Python. My data consists of numeric variables, but I also have a categorical land cover variable (with 7 classes) that I need to include in the model. This works perfectly fine using statsmodels, but I haven't been able to figure out how to do this in spreg.

I have tried creating dummy variables manually (using pd.get_dummies(data['land_cover'])), but this results in an error message for my spreg.OLS model:

RuntimeWarning: invalid value encountered in sqrt
  se_result = np.sqrt(variance)

RuntimeWarning: invalid value encountered in sqrt
  tStat = betas[list(range(0, len(vm)))].reshape(len(vm),) / np.sqrt(variance)

The constant and all the dummy variables also have nan values in the Std.Error, t-Statistic and Probability columns of the results (see excerpt below).

        Variable     Coefficient       Std.Error     t-Statistic     Probability

        CONSTANT    -142.9375000             nan             nan             nan
     temperature       0.0136240       0.0001169     116.4984154       0.0000000
   precipitation       0.0000003       0.0000000     153.7448775       0.0000000
         cover_1     141.9375000             nan             nan             nan
         cover_2     142.0625000             nan             nan             nan
         cover_3     141.6875000             nan             nan             nan
         cover_4     142.0625000             nan             nan             nan
         cover_5     141.9375000             nan             nan             nan
         cover_6     141.6875000             nan             nan             nan
         cover_7     141.8125000             nan             nan             nan

Using statsmodels with the same data/variables, the output of the OLS model was this:

                            coef    std err          t      P>|t|
     temperature         -0.0004   2.72e-05    -15.115      0.000
   precipitation       -1.62e-08   4.12e-10    -39.294      0.000
         cover_1          0.0706      0.001    119.653      0.000
         cover_2          0.0290      0.001     29.431      0.000
         cover_3          0.0100      0.001      7.120      0.000 
         cover_4          0.0491      0.000    122.972      0.000
         cover_5          0.0327      0.000     79.698      0.000 
         cover_6          0.0140      0.000     35.541      0.000 
         cover_7         -0.0026      0.001     -4.223      0.000 

How can I include my categorical data in the spreg models (e.g. spreg.GM_Lag)?

Upvotes: 1

Views: 209

Answers (1)

Josef

Reputation: 22897

My guess is that you ran into the "dummy variable trap".

Your statsmodels version has no constant, but spreg adds one automatically, so the two models are not the same.

If you don't drop a reference level from your categorical variable, the full set of dummies sums to one in every row and is therefore perfectly collinear with the constant. The design matrix is then singular, and the cross-product matrix x'x is not invertible, which is where the nan standard errors come from.
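Here is a minimal sketch of the fix, assuming pandas and NumPy. The toy table is made up and only borrows your column names, so treat the values as hypothetical. It checks the rank of the design matrix with and without a reference level dropped (pd.get_dummies with drop_first=True), and then builds the x array you would pass to spreg. The spreg calls appear only as comments because I have not run them on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real table (values are made up).
rng = np.random.default_rng(0)
n = 70
data = pd.DataFrame({
    "temperature": rng.normal(15.0, 5.0, n),
    "precipitation": rng.normal(800.0, 100.0, n),
    "land_cover": np.tile(np.arange(1, 8), n // 7),  # 7 classes, 10 rows each
})
numeric = data[["temperature", "precipitation"]]


def design_rank(dummies):
    """Return (rank, number of columns) of [constant, numeric, dummies]."""
    X = np.column_stack([
        np.ones(n),                      # the constant that spreg adds itself
        numeric.to_numpy(),
        dummies.to_numpy(dtype=float),
    ])
    return int(np.linalg.matrix_rank(X)), X.shape[1]


# All 7 dummies plus a constant: one column is redundant (dummy variable trap).
all_dummies = pd.get_dummies(data["land_cover"], prefix="cover")
full_rank, full_cols = design_rank(all_dummies)

# Drop the first level as the reference class: the matrix has full column rank.
reduced = pd.get_dummies(data["land_cover"], prefix="cover", drop_first=True)
red_rank, red_cols = design_rank(reduced)

print("all dummies:   rank", full_rank, "of", full_cols, "columns")
print("reference out: rank", red_rank, "of", red_cols, "columns")

# Regressors for spreg: no constant column, because spreg adds CONSTANT itself.
x = np.column_stack([numeric.to_numpy(), reduced.to_numpy(dtype=float)])
names = list(numeric.columns) + list(reduced.columns)

# Untested sketch of the model calls (y as an (n, 1) array, w your weights):
# model = spreg.OLS(y, x, name_x=names)
# model = spreg.GM_Lag(y, x, w=w, name_x=names)
```

With the reduced matrix, each remaining cover coefficient is read as the difference from the dropped class (cover_1 here), and the constant absorbs the level of that reference class.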

Upvotes: 1
