chrismoltisanti

Reputation: 39

ARIMA/SARIMAX for time series forecasting

I am trying to forecast sales for more than 2000 products. I resample each product's sales into weekly data, and each product's time series behaves differently. Seasonal patterns are not obvious, so I decided to run the auto_arima function in Python under two conditions: one that assumes seasonality and one that does not. For the seasonal case, I set the period to 52 weeks because the peaks in the seasonal decomposition of the data recur after one year.

My question: is it good practice to run auto_arima under both conditions and keep whichever model (ARIMA or SARIMAX) gives the lowest MSE? Also, auto_arima is very slow when it searches for the order of the SARIMAX model. I would be glad to hear any advice on speeding it up and on my first question.

Thanks.

from math import sqrt

import pandas as pd
from pmdarima import auto_arima
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX

# --- Seasonal candidates (m=52) ---
# Rows are collected in lists because DataFrame.append was removed in pandas 2.0.
model_rows, result_rows = [], []

for k in range(len(df_stationary_items)):
    item = df_stationary_items[k]
    test_df = grouped_df.get_group(item)
    X = test_df['Quantity'].values
    # Hold out the last week as the single test point.
    train, test = X[:-1], X[-1:]
    try:
        # Select the orders on the training data only, so the held-out week
        # does not leak into model selection. n_jobs is dropped because it
        # has no effect when stepwise=True.
        stepwise_fit = auto_arima(train, start_p=0, start_q=0,
                                  max_p=6, max_q=6, m=52,
                                  start_P=0, seasonal=True, alpha=0.05,
                                  d=None, D=None, max_D=1, trace=True,
                                  error_action='ignore', stepwise=True)
        model_rows.append({"ItemNo": item, "Order": stepwise_fit.order,
                           "SeasonalOrder": stepwise_fit.seasonal_order})

        model = SARIMAX(train, order=stepwise_fit.order,
                        seasonal_order=stepwise_fit.seasonal_order)
        model_fit = model.fit()
        predictions = model_fit.predict(start=len(train), end=len(train) + len(test) - 1, dynamic=False)
        rmse = sqrt(mean_squared_error(test, predictions))
        # NOTE: `result` (the stationarity test output) is not defined in this snippet.
        result_rows.append({"ItemNo": item, "StationaryP": result[1],
                            "Order": stepwise_fit.order,
                            "SeasonalOrder": stepwise_fit.seasonal_order,
                            "Predicted": predictions[0], "Expected": test[0],
                            "STDEV": test_df['Quantity'].std(), "rmse": rmse})
    except Exception as exc:
        # Report the failure instead of silently skipping the item.
        print(f"Seasonal model failed for item {item}: {exc}")
        continue

df_models = pd.DataFrame(model_rows)
df_model_results = pd.DataFrame(result_rows)

# --- Non-seasonal candidates ---
df_test_results_nonseasonal = pd.DataFrame()
model_rows_non, result_rows_non = [], []

for k in range(len(df_stationary_items)):
    item = df_stationary_items[k]
    test_df_nonseasonal = grouped_df.get_group(item)
    X_non = test_df_nonseasonal['Quantity'].values
    train_non, test_non = X_non[:-1], X_non[-1:]
    try:
        # As above, the order is selected on the training data only.
        stepwise_nonseasonal = auto_arima(train_non, error_action='ignore', seasonal=False)
        model_rows_non.append({"ItemNo": item, "Order": stepwise_nonseasonal.order})
        model_non = ARIMA(train_non, order=stepwise_nonseasonal.order)
        model_fit_non = model_non.fit()
        predictions_non = model_fit_non.predict(start=len(train_non), end=len(train_non) + len(test_non) - 1, dynamic=False)
        rmse_non = sqrt(mean_squared_error(test_non, predictions_non))
        # NOTE: `result_non` is not defined in this snippet either.
        result_rows_non.append({"ItemNo": item, "StationaryP": result_non[1],
                                "Order": stepwise_nonseasonal.order,
                                "Predicted": predictions_non[0], "Expected": test_non[0],
                                "STDEV": test_df_nonseasonal['Quantity'].std(),
                                "rmse": rmse_non})
    except Exception as exc:
        print(f"Non-seasonal model failed for item {item}: {exc}")
        continue

df_models_nonseasonal = pd.DataFrame(model_rows_non)
df_model_results_nonseasonal = pd.DataFrame(result_rows_non)
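To make the first question concrete, the selection step I have in mind (keep whichever model has the lower error per item) could look roughly like the sketch below. The two small frames stand in for df_model_results and df_model_results_nonseasonal; their values are made up, and the helper name `pick_best_model` is only illustrative.

```python
import pandas as pd

# Hypothetical results in the same shape as df_model_results and
# df_model_results_nonseasonal (values are made up for illustration).
seasonal_results = pd.DataFrame({"ItemNo": ["A", "B", "C"], "rmse": [4.0, 9.5, 2.1]})
nonseasonal_results = pd.DataFrame({"ItemNo": ["A", "B", "D"], "rmse": [5.2, 7.0, 3.3]})


def pick_best_model(seasonal, nonseasonal):
    """Return one row per item with the model type that has the lower RMSE."""
    merged = seasonal[["ItemNo", "rmse"]].merge(
        nonseasonal[["ItemNo", "rmse"]],
        on="ItemNo", how="outer", suffixes=("_seasonal", "_nonseasonal"),
    )
    # An item whose fit failed in one loop is NaN on that side;
    # treating it as infinite error lets the other model win.
    s = merged["rmse_seasonal"].fillna(float("inf"))
    n = merged["rmse_nonseasonal"].fillna(float("inf"))
    # Ties go to the simpler non-seasonal model.
    merged["best_model"] = (s < n).map({True: "SARIMAX", False: "ARIMA"})
    merged["best_rmse"] = pd.concat([s, n], axis=1).min(axis=1)
    return merged.sort_values("ItemNo").reset_index(drop=True)


best = pick_best_model(seasonal_results, nonseasonal_results)
print(best)
```

Since each RMSE here comes from a single held-out week, the comparison is noisy, which is part of why I am unsure whether this is good practice.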

Any advice for forecasting of multiple products would be great!

Upvotes: 1

Views: 1102

Answers (0)
