Reputation: 201
I am trying to do a panel regression with
from linearmodels.panel import PooledOLS
import statsmodels.api as sm
but I ran into a problem with my index. The regression needs a MultiIndex, so I have a dummy variable and the time (see below). The two index levels are α_c (a dummy variable identifying the 10 countries in my analysis) and Timestamp (the date).
                 i_ct  preemu  postemu            γ1            γ2
α_c Timestamp
10  2020-06-24  -0.04    0.00     0.02 -1.110223e-16  2.000000e-02
    2020-06-25   0.05    0.00    -0.04 -1.000000e-02  1.000000e-02
    2020-06-26   0.02    0.00     0.05  0.000000e+00  1.000000e-02
    2020-06-29   0.00    0.00     0.02 -2.000000e-02 -1.110223e-16
    2020-06-30   0.08    0.00     0.00  2.000000e-02 -6.000000e-02
When I run the regression
exog_vars = ['preemu','postemu','γ1','γ2']
exog = sm.add_constant(beta_panel[exog_vars])
mod = PooledOLS(beta_panel.i_ct, exog)
pooled_res = mod.fit()
print(pooled_res)
I get this error:
ValueError: The index on the time dimension must be either numeric or date-like
But when the timestamp is not in the index I get this error:
Series can only be used with a 2-level MultiIndex
Does anyone know why it throws the first error?
Upvotes: 4
Views: 3281
Reputation: 476
The time dimension index in the example provided looks like a string. Converting it to a date format or a numerical format should work.
Also note that the entity index must come first and the time dimension second in the MultiIndex, as the example provided has correctly done. Getting this order wrong is a small oversight that can also cause errors.
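As a rough sketch of the conversion described above, the time level can be rebuilt as a real datetime while keeping the entity level first. The helper name `with_datetime_time_level` and the small sample frame are made up for illustration; only the column names `α_c`, `Timestamp`, `i_ct` and `preemu` come from the question.

```python
import pandas as pd

# Hypothetical sample data mimicking the question: dates stored as strings.
df = pd.DataFrame({
    "α_c": [10, 10, 10],
    "Timestamp": ["2020-06-24", "2020-06-25", "2020-06-26"],
    "i_ct": [-0.04, 0.05, 0.02],
    "preemu": [0.0, 0.0, 0.0],
})
beta_panel = df.set_index(["α_c", "Timestamp"])


def with_datetime_time_level(panel, time_level="Timestamp"):
    """Return a copy whose time level is datetime64, entity level first."""
    entity_level = [name for name in panel.index.names if name != time_level][0]
    flat = panel.reset_index()
    # Parse the string dates into proper datetime values.
    flat[time_level] = pd.to_datetime(flat[time_level])
    # Entity first, time second.
    return flat.set_index([entity_level, time_level])


fixed_panel = with_datetime_time_level(beta_panel)
print(fixed_panel.index.get_level_values("Timestamp").dtype)
```

The resulting frame can then be passed to the regression in place of the original `beta_panel`.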
Upvotes: 2
Reputation: 1
Change the string dates to a numerical type, for example 2021/01/31 becomes 20210131. Then it works. (I tried converting to a timestamp or another time datatype, but the error persisted.)
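A minimal sketch of that numeric conversion, assuming the dates are stored as strings in `YYYY/MM/DD` form; the sample values and the name `numeric_panel` are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data with slash-separated string dates.
flat = pd.DataFrame({
    "α_c": [10, 10, 10],
    "Timestamp": ["2021/01/29", "2021/01/31", "2021/02/01"],
    "i_ct": [0.01, -0.02, 0.03],
})

# Parse the strings, then encode each date as an integer such as 20210131.
as_dates = pd.to_datetime(flat["Timestamp"], format="%Y/%m/%d")
flat["Timestamp"] = as_dates.dt.strftime("%Y%m%d").astype("int64")

# Entity level first, numeric time level second.
numeric_panel = flat.set_index(["α_c", "Timestamp"])
print(numeric_panel.index.get_level_values("Timestamp").tolist())
```

Note that integers encoded this way keep the dates in order but are not evenly spaced across month boundaries, which may matter for anything that relies on the gaps between periods.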
Upvotes: 0