Reputation: 19624
I am using a VAR model with lag 2 to forecast a multivariate time series. I have three features and would like to forecast several timestamps forward. Instead of forecasting all three features, I actually know the future values of two of them and would like to forecast only the remaining one.
If I wanted to forecast all three features 5 timestamps ahead, I could do that as follows (this is a toy example):
import pandas as pd
from statsmodels.tsa.api import VAR
data = pd.DataFrame({'Date': ['1959-06-01', '1959-06-02', '1959-06-03', '1959-06-04'],
                     'a': [1, 2, 3, 5], 'b': [2, 3, 5, 8], 'c': [3, 4, 7, 11]})
data.set_index('Date', inplace=True)
model = VAR(data)
results = model.fit(2)
results.forecast(data.values[-2:], 5)
Note that data is:
            a  b   c
Date
1959-06-01  1  2   3
1959-06-02  2  3   4
1959-06-03  3  5   7
1959-06-04  5  8  11
And the forecast gives me
array([[ 8.01388889, 12.90277778, 17.79166667],
[ 12.93113426, 20.67650463, 28.421875 ],
[ 20.73343461, 33.12405961, 45.51468461],
[ 33.22366195, 52.98948789, 72.75531383],
[ 53.15895736, 84.72805652, 116.29715569]])
Let's say I knew that the next 5 values for a should actually have been 8, 13, 21, 34, 55 and the next 5 values for b should have been 13, 21, 34, 55, 89. Is there a way to incorporate that into the model in statsmodels.tsa (or any other Python package) to forecast only the 5 values of c? I know that R has such an option, by incorporating "hard" conditions into cpredict.VAR, but I was wondering if this can be done in Python as well.
The above is a toy example. In reality I have several dozen features, but I know the future values of all but one of them and would like to predict only that one using a VAR model.
Upvotes: 4
Views: 2267
Reputation: 21
I had a similar issue when working on this problem. This is a makeshift method to accomplish what you are asking.
prediction = model_fit.forecast(model_fit.y, steps=len(test))
predictions = prediction[:,0]
Here the 0 in prediction[:,0] is the index of the column that contains the variable you want to forecast.
Upvotes: 1