Reputation: 31
I am trying to forecast demand from a six-year dataset (1/1/2014 to 1/1/2020). First I aggregated demand by month, so I ended up with a dataset of 2 columns (month and sales) and 72 rows (12 months × 6 years). P.S.: I am working in Python.
My first question: is that enough data to predict the next year (2020), given that I only have 72 rows?
My second question: are there any models you would advise me to use that would give good accuracy?
I have tried a seasonal ARIMA model (SARIMAX) and an LSTM, but neither worked, and I am not sure I am doing it right.
My third question: are there any tests in Python that tell you whether there is seasonality or not?
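import pandas as pd
import matplotlib.pyplot as plt
#shrink the dataset (data is the full DataFrame, loaded beforehand)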
dataa=data[(data['Produit']=='ACP NOR/STD')&(data['Région']=='Europe')]
gb2=dataa.groupby(by=[dataa['Mois'].dt.strftime('%Y, %m')])['Chargé (T)'].sum().reset_index()
gb2.Mois=pd.to_datetime(gb2.Mois)
#create a time series
series = pd.Series(gb2['Chargé (T)'].values, index=gb2.Mois)
#decompose the dataset to 3 things: trend, seasonality and noise
from pylab import rcParams
import statsmodels.api as sm
rcParams['figure.figsize'] = 18, 8
decomposition = sm.tsa.seasonal_decompose(series, model='additive')
fig = decomposition.plot()
plt.show()
#plot ACF and PACF to help choose the model orders
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.graphics.tsaplots import plot_pacf
from matplotlib import pyplot
pyplot.figure()
pyplot.subplot(211)
plot_acf(series, ax=pyplot.gca())
pyplot.subplot(212)
plot_pacf(series, ax=pyplot.gca())
pyplot.show()
import itertools
p = d = q = range(0, 5)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]
print('Examples of parameter combinations for Seasonal ARIMA...')
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[1]))
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[2]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[3]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[4]))
import warnings
warnings.filterwarnings("ignore")
for param in pdq:
    for param_seasonal in seasonal_pdq:
        try:
            mod = sm.tsa.statespace.SARIMAX(series,
                                            order=param,
                                            seasonal_order=param_seasonal,
                                            enforce_stationarity=False,
                                            enforce_invertibility=False)
            results = mod.fit()
            print('ARIMA{}x{}12 - AIC:{}'.format(param, param_seasonal, results.aic))
        except Exception:
            # skip parameter combinations that fail to fit
            continue
mod = sm.tsa.statespace.SARIMAX(series,
                                order=(0, 1, 2),
                                seasonal_order=(0, 4, 0, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()
print(results.summary().tables[1])
results.plot_diagnostics(figsize=(16, 8))
plt.show()
#get predictions
pred = results.get_prediction(start=pd.to_datetime('2019-01-01'), dynamic=False)
pred_ci = pred.conf_int()
ax = series['2014':].plot(label='observed')
pred.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.8, figsize=(14, 7))
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('Chargé (T)')
plt.legend()
plt.show()
The predictions have nothing to do with reality... I would really appreciate anyone's help.
Upvotes: 1
Views: 736
Reputation: 51
I've done a lot of research on this and completed one project on time-series prediction. Here is an example that describes all the steps:
Upvotes: 1
Reputation: 31
Answer to your first question: the data you have collected looks small. It would help if you could collect it at daily granularity, since recurrent neural networks tend to perform well on data sampled at short time intervals. Daily data would give you roughly 12 × 30 × 6 = 2,160 rows, a much better input for any model.
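To put the data-size concern in numbers, here is a minimal sketch. The helper name is hypothetical, not a library function. It relies on simple arithmetic: each regular difference drops one observation, and each seasonal difference drops one full period. Applied to 72 monthly rows, it shows how little data remains with a seasonal order such as the (0, 4, 0, 12) used in the question.

```python
def observations_after_differencing(n_obs, d=0, D=0, period=12):
    """Count observations left after d regular and D seasonal differences.

    Hypothetical helper for illustration: a regular difference costs one
    observation, a seasonal difference costs `period` observations.
    """
    remaining = n_obs - d - D * period
    return max(remaining, 0)


n_rows = 72  # six years of monthly totals, as described in the question

for seasonal_d in range(0, 5):
    left = observations_after_differencing(n_rows, d=1, D=seasonal_d, period=12)
    print(f"d=1, D={seasonal_d}: {left} observations left")
```

With d=1 and D=4, only 23 of the 72 observations remain, which may explain unstable forecasts. Keeping d and D at 0 or 1 is the usual starting point for a series this short.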
Answer to the second question: I personally suggest trying LSTMs with more data and well-chosen parameters; a good collection of them is given in this Medium post.
Performance varies with the parameters, so be careful in selecting the ones you feed in.
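Whichever model and parameters are chosen, it may help to hold out the last 12 months and compare the model against a trivial baseline. The sketch below uses a seasonal naive forecast, which repeats the last observed year. The function names and the synthetic numbers are illustrative assumptions, not the question's data.

```python
def seasonal_naive_forecast(history, horizon, period=12):
    """Forecast by repeating the last full seasonal cycle of `history`."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic monthly series: linear trend plus a bump in months 6 to 8.
monthly = [100 + 2 * t + 10 * ((t % 12) in (5, 6, 7)) for t in range(72)]

# Hold out the final year for evaluation.
train, test = monthly[:-12], monthly[-12:]

baseline = seasonal_naive_forecast(train, horizon=len(test))
baseline_mae = mean_absolute_error(test, baseline)
print("Seasonal naive MAE:", baseline_mae)
```

A SARIMAX or LSTM forecast is only worth keeping if its error on the same held-out year is clearly lower than this baseline.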
Answer to the third question: seasonality is generally detected using the technique called "anomaly detection". The Medium post mentioned above also discusses this briefly.
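As a hedged complement to the answer above, a more direct check is the autocorrelation at the seasonal lag. For monthly data, a value at lag 12 well above the approximate white-noise band of 1.96/sqrt(n) suggests yearly seasonality. The sketch below computes it by hand on a synthetic series. The function name is an assumption for illustration; statsmodels offers `statsmodels.tsa.stattools.acf` for the same purpose. A strong trend inflates autocorrelation at every lag, so real data should be detrended or differenced first.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


# Synthetic monthly series with a pure 12-month cycle (illustration only).
seasonal = [10 * math.sin(2 * math.pi * t / 12) for t in range(72)]

r12 = autocorrelation(seasonal, 12)  # same month, one year apart
r6 = autocorrelation(seasonal, 6)    # opposite phase of the cycle

# Approximate 95% band for a series with no autocorrelation.
threshold = 1.96 / math.sqrt(len(seasonal))

print("lag 12:", round(r12, 3), "lag 6:", round(r6, 3),
      "threshold:", round(threshold, 3))
```

Here the lag-12 value is strongly positive and the lag-6 value strongly negative, the pattern expected from a yearly cycle.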
Upvotes: 0