Reputation: 67
I would like to include categorical variables in my PanelOLS model. This is my model:
import pandas as pd
import statsmodels.api as sm
from linearmodels.panel import PanelOLS
exog_vars = ['x1', 'x2', 'x3']
exog = sm.add_constant(df[exog_vars])
mod = PanelOLS(df.y, exog, entity_effects=True, time_effects=True)
result = mod.fit(cov_type='clustered', cluster_entity=True)
The categorical variable is a number identifying an industry. This number is stored in my dataframe (df['x4']).
Do you know how to include categorical variables, or do you need more information to answer the question?
I tried:
df['x4'] = pd.Categorical(df.x4)
mod = PanelOLS(df.y, exog, other_effects=df['x4'], entity_effects=True, time_effects=True)
The following error occurred:
raise ValueError('At most two effects supported.')
ValueError: At most two effects supported.
Upvotes: 0
Views: 793
Reputation: 10486
The simplest way to do this is probably to one-hot-encode your column x4.
If you have
df = pd.DataFrame({'x1': [1,2,3], 'x4': ['bob', 'cat' ,'cat']})
df
which looks like
   x1   x4
0   1  bob
1   2  cat
2   3  cat
then
pd.get_dummies(df, columns=['x4'], dtype=int)
gives you
   x1  x4_bob  x4_cat
0   1       1       0
1   2       0       1
2   3       0       1
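As a minimal sketch of how the dummy columns might then be combined with the other regressors from the question: the toy panel, the `firm`/`year` index names, the values of `x4`, and the choice of `drop_first=True` are all assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical toy panel: two firms observed over two years.
index = pd.MultiIndex.from_product(
    [["A", "B"], [2000, 2001]], names=["firm", "year"]
)
df = pd.DataFrame(
    {"x1": [1.0, 2.0, 3.0, 4.0], "x4": [10, 20, 20, 30]},
    index=index,
)

# One-hot-encode x4. drop_first=True omits one level so the dummies are
# not perfectly collinear with a constant or with fixed effects.
dummies = pd.get_dummies(df["x4"], prefix="x4", drop_first=True, dtype=int)

# Join the dummies to the remaining regressors to form the exog matrix.
exog = df[["x1"]].join(dummies)
print(exog)
```

The resulting `exog` could then be passed to `PanelOLS` in place of the one built in the question. Note that if `x4` never changes within an entity, entity effects would absorb the dummies, so they could not be estimated alongside `entity_effects=True`.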
Alternatively,
df['x4'] = pd.Categorical(df.x4).codes
df
will give you
   x1  x4
0   1   0
1   2   1
2   3   1
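If you go the integer-code route, it may help to keep the mapping from codes back to the original labels. The following is only a sketch; the column name `x4_code` and the variable `code_to_label` are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({"x1": [1, 2, 3], "x4": ["bob", "cat", "cat"]})

# Build the categorical once so codes and categories stay in sync.
cat = pd.Categorical(df["x4"])

# Store the integer codes in a separate (hypothetical) column.
df["x4_code"] = cat.codes

# Map each integer code back to its original label.
code_to_label = dict(enumerate(cat.categories))
print(code_to_label)
```

Keep in mind that integer codes entered directly as a regressor are treated as a single numeric variable, so the one-hot encoding above is usually the safer choice for unordered categories such as industries.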
Upvotes: 1