Reputation: 147
I'm trying to forecast future values from my monthly dataset (each observation is dated the first day of the month, 12 per year) and I'm getting this warning:
ValueWarning: A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.
I've searched Google and Stack Overflow but couldn't find a relevant thread or a good enough solution.
This is head(13) of my dataframe:
            Occupancy rate  Average Price     RevPAR
Date
2013-01-01        0.579026     105.289497  60.965332
2013-02-01        0.637415     109.396682  69.731070
2013-03-01        0.714847     117.840534  84.237901
2013-04-01        0.716446     122.765139  87.954593
2013-05-01        0.771097     105.461387  81.320985
2013-06-01        0.768777     115.252163  88.603262
2013-07-01        0.677020      81.824781  55.396987
2013-08-01        0.673639      72.489988  48.832110
2013-09-01        0.783291     125.034417  97.938296
2013-10-01        0.779694     118.724648  92.568902
2013-11-01        0.771430     113.322446  87.420366
2013-12-01        0.680166     100.950857  68.663388
2014-01-01        0.573320     102.881633  58.984090
And this is the very basic fit I'm trying to run to start with:
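from statsmodels.tsa.api import VAR

model = VAR(df)
results = model.fit(2)  # fit with 2 lags
results.forecast(df.values[-2:], 5)  # forecast 5 steps from the last 2 observations
results.summary()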
I'm assuming I need to set some kind of frequency attribute on the dataframe. I've tried a brute-force df.asfreq('M'), but it simply messes up my data.
Upvotes: 4
Views: 3098
Reputation: 620
I don't know the model you are using, but the warning is most likely caused either by missing values in the time series or by a mismatched freq (the freq alias for month start is MS).
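To illustrate the freq mismatch, here is a minimal sketch assuming a gap-free month-start index. The four-row frame is a hypothetical sample shaped like the question's data, and pd.offsets.MonthEnd() stands in for the month-end alias the question tried.

```python
import pandas as pd

# Hypothetical four-month sample shaped like the question's data (values rounded).
dates = pd.to_datetime(["2013-01-01", "2013-02-01", "2013-03-01", "2013-04-01"])
df = pd.DataFrame({"RevPAR": [60.97, 69.73, 84.24, 87.95]}, index=dates)

# An index built from raw dates carries no frequency information.
no_freq = df.index.freq is None

# pandas can infer the frequency of a regular index; month-start dates give "MS".
inferred = pd.infer_freq(df.index)

# Conforming to month start keeps every row and attaches the frequency.
month_start = df.asfreq("MS")

# Conforming to month end selects dates that are not in the data, so values become NaN.
month_end = df.asfreq(pd.offsets.MonthEnd())

print(no_freq, inferred, month_start.index.freqstr)
print(month_end)
```

If the index has no gaps, df.asfreq("MS") alone should be enough to attach the frequency without changing the data.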
So I think you can create a new DatetimeIndex with pd.date_range, then reindex the dataframe with it.
If the input dataframe is:
In [10]: df
Out[10]:
            0  1
2018-01-01  2  1
2018-03-01  0  0
We can then create a new index that covers every month start in the range:
In [12]: index = pd.date_range(start=df.index.min(), end=df.index.max(), freq='MS')
In [13]: index
Out[13]: DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01'], dtype='datetime64[ns]', freq='MS')
Then reindex the dataframe:
In [14]: df.reindex(index)
Out[14]:
              0    1
2018-01-01  2.0  1.0
2018-02-01  NaN  NaN
2018-03-01  0.0  0.0
Additionally, we can fill the NaN values in the dataframe with appropriate values so that the model can be trained.
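As a rough sketch of that last step, two common ways to fill the gap are shown below on the same toy frame. Which fill is appropriate depends on the data, so both choices here are assumptions and not a recommendation.

```python
import pandas as pd

# Same toy frame as above: February 2018 is missing.
df = pd.DataFrame(
    {0: [2, 0], 1: [1, 0]},
    index=pd.to_datetime(["2018-01-01", "2018-03-01"]),
)

# Build the complete month-start index and reindex onto it.
index = pd.date_range(start=df.index.min(), end=df.index.max(), freq="MS")
reindexed = df.reindex(index)

# Option A: linear interpolation between the neighbouring months.
interpolated = reindexed.interpolate(method="linear")

# Option B: carry the last observed value forward.
forward_filled = reindexed.ffill()

print(reindexed.index.freqstr)
print(interpolated)
print(forward_filled)
```

Either result has no NaN values and an index with freq MS, so it can be passed to the model in place of the original dataframe.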
Upvotes: 4