Pytno

Reputation: 67

Categorical variable in PanelOLS

I would like to include categorical variables in my PanelOLS model. This is my model:

import statsmodels.api as sm
from linearmodels.panel import PanelOLS

exog_vars = ['x1', 'x2', 'x3']
exog = sm.add_constant(df[exog_vars])
mod = PanelOLS(df.y, exog, entity_effects=True, time_effects=True)
result = mod.fit(cov_type='clustered', cluster_entity=True)

The categorical variable is a number identifying an industry. This number is stored in my dataframe (df['x4']). Do you know how to include categorical variables? Or do you need more information to answer the question?

I tried:

df['x4'] = pd.Categorical(df.x4)

mod = PanelOLS(df.y, exog, other_effects=df['x4'], entity_effects=True, time_effects=True)

The following error occurred:

raise ValueError('At most two effects supported.')

ValueError: At most two effects supported.

Upvotes: 0

Views: 793

Answers (1)

ignoring_gravity

Reputation: 10486

The simplest way to do this is probably to one-hot-encode your column x4.

If you have

df = pd.DataFrame({'x1': [1,2,3], 'x4': ['bob', 'cat' ,'cat']})
df

which looks like

   x1   x4
0   1  bob
1   2  cat
2   3  cat

then

# dtype=int gives 0/1 columns; recent pandas versions default to True/False
pd.get_dummies(df, columns=['x4'], dtype=int)

gives you

   x1  x4_bob  x4_cat
0   1       1       0
1   2       0       1
2   3       0       1
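
To use the dummies as regressors, they can be joined onto the other exogenous columns. The following is only a minimal sketch: the data values are made up, and drop_first=True is my addition (not part of the original suggestion), used so that one industry serves as the baseline and the dummies are not perfectly collinear with a constant.

```python
import pandas as pd

# Toy data; column names follow the question, the values are invented.
df = pd.DataFrame({
    'x1': [1.0, 2.0, 3.0, 4.0],
    'x4': [10, 20, 20, 30],  # hypothetical industry numbers
})

# One dummy column per industry, dropping the first category (10)
# so that it acts as the reference group.
dummies = pd.get_dummies(df['x4'], prefix='x4', drop_first=True, dtype=int)

# Combine the numeric regressors with the industry dummies.
exog = pd.concat([df[['x1']], dummies], axis=1)
exog_columns = list(exog.columns)
print(exog)
```

A frame like exog could then be passed to PanelOLS in place of df[exog_vars]. Note that if the industry never changes within an entity, the dummies are likely to be collinear with the entity effects, so this is mainly useful when entity effects are not included or the industry varies over time.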

Alternatively,

df['x4'] = pd.Categorical(df.x4).codes
df

will give you

   x1  x4
0   1   0
1   2   1
2   3   1
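
If you go the integer-code route, it may help to keep the mapping from codes back to the original labels. This is a small sketch under the assumption that the codes are stored in a separate column (x4_code is a hypothetical name) so the labels are not overwritten. Keep in mind that codes entered directly as a numeric regressor are treated as a single numeric variable, not as separate categories.

```python
import pandas as pd

df = pd.DataFrame({'x1': [1, 2, 3], 'x4': ['bob', 'cat', 'cat']})

# Build the categorical once so codes and categories stay in sync.
cat = pd.Categorical(df['x4'])

# Store the integer codes next to the original labels.
df['x4_code'] = cat.codes

# Codes are positions in cat.categories, so enumerate gives the lookup table.
code_to_label = dict(enumerate(cat.categories))
print(code_to_label)
```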

Upvotes: 1
