Reputation: 1706
I'm trying to run an X-13-ARIMA model from the statsmodels library in Python 3.
I found this example in statsmodels documentation:
import statsmodels.api as sm

dta = sm.datasets.co2.load_pandas().data
dta.co2.interpolate(inplace=True)
dta = dta.resample('M').sum()
res = sm.tsa.x13_arima_select_order(dta.co2)
print(res.order, res.sorder)
results = sm.tsa.x13_arima_analysis(dta.co2)
fig = results.plot()
fig.set_size_inches(12, 5)
fig.tight_layout()
This works fine, but I also need to predict future values of this time series. The tsa.x13_arima_analysis() function has a forecast_years parameter, so I suppose it should be possible. However, the results object doesn't seem to change no matter what value of forecast_years I choose.
How can I get the forecast values?
Upvotes: 2
Views: 10137
Reputation: 2187
Late response but hopefully helpful.
from statsmodels.tsa.x13 import x13_arima_analysis

result = x13_arima_analysis(df[dependent_var],
                            tempdir=output_directory,
                            forecast_periods=12)
Then, to find the forecast table inside the result.results output, search for the element whose id is fct.
from bs4 import BeautifulSoup
import pandas as pd
result_string = result.results
soup = BeautifulSoup(result_string, 'lxml')
specific_section = soup.find('div', id='fct')
table = specific_section.find('table') if specific_section else None
This gives you the raw HTML of the table, which you can then parse however you like. I needed a DataFrame (example below).
<table class="w70" summary="Confidence intervals with coverage probability
( 0.95000)">
<caption><strong>Confidence intervals with coverage probability ( 0.95000) <br/> On the Original Scale</strong></caption>
<tr>
<th scope="col">Date</th>
<th scope="col">Lower</th>
<th scope="col">Forecast</th>
<th scope="col">Upper</th>
</tr>
<tr>
<th scope="row">2023.Aug</th>
<td> 3560.68 </td>
<td> 3694.21 </td>
<td> 3832.74 </td>
</tr>
<tr>
<th scope="row">2023.Sep</th>
<td> 3393.61 </td>
<td> 3579.02 </td>
<td> 3774.55 </td>
</tr>
<tr>
<th scope="row">2023.Oct</th>
<td> 3275.37 </td>
<td> 3491.64 </td>
<td> 3722.18 </td>
Example:
if table:
    dates, lowers, forecasts, uppers = [], [], [], []
    for row in table.find_all('tr')[1:]:  # skip the header row
        columns = row.find_all('td')
        date = row.find('th', scope='row').text
        lower, forecast, upper = [col.text.strip() for col in columns]
        dates.append(date.replace('.', '_'))
        lowers.append(float(lower))
        forecasts.append(float(forecast))
        uppers.append(float(upper))
    # Create a DataFrame
    df = pd.DataFrame({
        'Date': dates,
        'Lower': lowers,
        'Forecast': forecasts,
        'Upper': uppers
    })
>>
        Date    Lower  Forecast    Upper
0   2023_Aug  3560.68   3694.21  3832.74
1   2023_Sep  3393.61   3579.02  3774.55
2   2023_Oct  3275.37   3491.64  3722.18
3   2023_Nov  3070.97   3318.68  3586.36
4   2023_Dec  2895.54   3162.95  3455.05
5   2024_Jan  2884.05   3179.71  3505.69
6   2024_Feb  2901.60   3228.93  3593.18
7   2024_Mar  2979.65   3343.18  3751.07
8   2024_Apr  3008.83   3401.86  3846.23
9   2024_May  3068.21   3494.67  3980.41
10  2024_Jun  3089.83   3543.59  4064.00
11  2024_Jul  3066.04   3539.45  4085.96
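If real dates are more useful than labels such as 2023_Aug, the Date column can be converted afterwards. Below is a minimal sketch using only the standard library. The helper name label_to_date is made up for illustration, and it assumes monthly data with English month abbreviations (the %b directive depends on the active locale).

```python
from datetime import date, datetime


def label_to_date(label):
    """Turn a label such as '2023.Aug' or '2023_Aug' into a date.

    Hypothetical helper, not part of statsmodels. Assumes monthly data and
    English month abbreviations; '%b' depends on the active locale.
    """
    normalized = label.strip().replace('_', '.')
    parsed = datetime.strptime(normalized, '%Y.%b')
    # Monthly forecasts carry no day, so use the first day of the month.
    return date(parsed.year, parsed.month, 1)


labels = ['2023_Aug', '2023_Sep', '2024.Jan']
dates = [label_to_date(label) for label in labels]
```

With pandas, pd.to_datetime(df['Date'], format='%Y_%b') should achieve the same in one call, under the same locale assumption.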
Upvotes: 0
Reputation: 21
forecast_years=x worked for me. Pay attention to the version of statsmodels you are running (pip freeze | grep statsmodels): in version 0.10.2 the parameter for the forecasting horizon is forecast_years, but in version 0.11.0 and higher it is forecast_periods.
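If the same script has to run against both old and new releases, one option is to check which keyword the function accepts instead of comparing version strings. This is only a sketch: horizon_kwargs is a hypothetical helper, and old_style and new_style are stand-ins that imitate the two signatures so the example runs without statsmodels installed.

```python
import inspect


def horizon_kwargs(func, periods):
    """Return the forecast-horizon keyword argument that `func` accepts.

    Hypothetical helper: prefers 'forecast_periods' and falls back to
    'forecast_years' when the newer name is absent.
    """
    params = inspect.signature(func).parameters
    if 'forecast_periods' in params:
        return {'forecast_periods': periods}
    if 'forecast_years' in params:
        return {'forecast_years': periods}
    raise TypeError('function accepts neither forecast_periods nor forecast_years')


# Stand-ins for the old and new signatures, used only to demonstrate the helper.
def old_style(endog, forecast_years=None):
    return forecast_years


def new_style(endog, forecast_periods=None):
    return forecast_periods


old_kwargs = horizon_kwargs(old_style, 12)
new_kwargs = horizon_kwargs(new_style, 12)
```

With statsmodels the call would then look something like sm.tsa.x13_arima_analysis(series, **horizon_kwargs(sm.tsa.x13_arima_analysis, 12)), where series is your own data. How the signature is reported on every release has not been verified here, so treat the fallback as an assumption.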
A simple regex should do the trick to find your forecast values (the dots are escaped so that they match only a literal decimal point):
202\d\.\w{3}\s{6}\d\d\.\d\d\s{5}\d\d\.\d\d\s{5}\d\d\.\d\d
(run on each line of your results)
which would match:
2020.Feb      18.04     32.25     46.47
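A fixed-width pattern like this breaks as soon as the values have a different number of digits or the column spacing changes. A more tolerant variant might look like the sketch below. It assumes plain-text output with monthly labels of the form YYYY.Mon followed by exactly three numbers (lower, forecast, upper); the names ROW and parse_interval_rows are illustrative, not part of statsmodels.

```python
import re

# Date label, then three numbers: lower bound, forecast, upper bound.
# Dots are escaped and the spacing is flexible, unlike a fixed-width pattern.
ROW = re.compile(
    r'^\s*(\d{4}\.\w{3})\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$'
)


def parse_interval_rows(text):
    """Collect (date, lower, forecast, upper) tuples from text output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            date, lower, forecast, upper = match.groups()
            rows.append((date, float(lower), float(forecast), float(upper)))
    return rows


sample = """
   Date       Lower  Forecast     Upper
 2020.Feb     18.04     32.25     46.47
 2020.Mar    118.04    132.25    146.47
 2020.Feb     32.25     2.954
"""
rows = parse_interval_rows(sample)
```

Lines with only two numbers, such as the forecast and standard error rows, are skipped because the pattern requires three. Any other line in the output that happens to have the same shape would also be picked up, so it is worth checking the result against the printed table.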
Upvotes: 1
Reputation: 21663
By now you have probably worked this out yourself. I retrieved some monthly weather data that ends in July 2012 and entered this statement to run the analysis.
results = sm.tsa.x13_arima_analysis(s, forecast_years=3)
Then, having found that results.results is voluminous, I wrote it to a file.
with open('c:/scratch/result.txt', 'w') as f:
    f.write(results.results)
Searching this file for 'forecast', I found the following section.
FORECASTING
  Origin  2012.Jul
  Number         3

  Forecasts and Standard Errors of the Prior Adjusted Data
  ------------------------------
                       Standard
      Date   Forecast     Error
  ------------------------------
   2012.Aug     33.02     2.954
   2012.Sep     28.31     2.954
   2012.Oct     21.54     2.954
  ------------------------------

  Confidence intervals with coverage probability ( 0.95000)
  ---------------------------------------
      Date      Lower  Forecast     Upper
  ---------------------------------------
   2012.Aug     27.23     33.02     38.82
   2012.Sep     22.52     28.31     34.10
   2012.Oct     15.75     21.54     27.33
  ---------------------------------------
So forecast_years=3 seems to be taken to mean a forecast of three months, in this case starting after July.
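Instead of writing the whole output to a file and searching it by eye, the section can also be sliced out of the results string directly. This is a minimal sketch, assuming plain-text output that contains a line reading exactly FORECASTING, as in the excerpt above. The helper name and the max_lines cap are made up for illustration.

```python
def forecasting_section(text, heading='FORECASTING', max_lines=40):
    """Return up to `max_lines` lines starting at the first `heading` line.

    Hypothetical helper; `max_lines` is an arbitrary cap because this sketch
    does not rely on any marker for the end of the section.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == heading:
            return '\n'.join(lines[index:index + max_lines])
    return None  # heading not present, e.g. HTML output or no forecast


sample = "some earlier output\n FORECASTING\n  Origin 2012.Jul\n  Number 3\n"
section = forecasting_section(sample, max_lines=2)
```

In practice this would be called as forecasting_section(results.results) and the returned text printed or parsed further.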
Upvotes: 2