sai akhil

Reputation: 1

AutoTS: Forecast with the top 3 models instead of the best model

I am using the AutoTS package for multivariate time series forecasting. The idea is to fit the model on a training set and forecast the validation set, because I want metrics for each individual column. I then want to pick the top 3 models from validation and use them to forecast the next 4 quarters from the entire dataset.

I am performing a train-validation split to compare the mean absolute error (MAE) results of individual columns with other univariate time series models. This approach ensures a common comparison metric. If there are better methods for achieving this, please suggest them.
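As a minimal sketch of the per-column comparison described above: the data, the column names (`5ml`, `10ml`), the helper `per_column_mae`, and the naive last-value forecast are all hypothetical stand-ins, not AutoTS output. It assumes the data has been pivoted to wide format (one row per quarter, one column per series).

```python
import pandas as pd


def per_column_mae(actual: pd.DataFrame, forecast: pd.DataFrame) -> pd.Series:
    """Mean absolute error per series (column), averaged over the forecast horizon."""
    # Align the forecast to the actuals so rows and columns are compared like for like.
    forecast = forecast.reindex(index=actual.index, columns=actual.columns)
    return (actual - forecast).abs().mean(axis=0)


# Hypothetical wide-format data: 8 quarters, one column per vial size.
index = pd.date_range("2022-01-01", periods=8, freq="QS")
wide = pd.DataFrame(
    {
        "5ml": [10, 12, 11, 13, 14, 15, 16, 18],
        "10ml": [20, 19, 21, 22, 24, 23, 25, 26],
    },
    index=index,
    dtype=float,
)

# Hold out the last 2 quarters as the validation set.
train, valid = wide.iloc[:-2], wide.iloc[-2:]

# Stand-in forecast: repeat the last training row for every validation quarter.
naive = pd.DataFrame(
    [train.iloc[-1].to_numpy()] * len(valid),
    index=valid.index,
    columns=valid.columns,
)

mae = per_column_mae(valid, naive)
print(mae)
```

Because the result is one MAE value per column, it can be placed side by side with the MAE of any univariate model evaluated on the same two held-out quarters.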

    import pandas as pd
    from autots import AutoTS, model_forecast
    
    # Configure the AutoTS model search
    model = AutoTS(
        forecast_length=2,
        frequency='infer',
        model_list=['NVAR','MAR','MultivariateRegression','NeuralForecast','DynamicFactorMQ'],
        transformer_list="fast",  # "superfast", "default", "fast_parallel"
        transformer_max_depth=2,
        max_generations=1,
        num_validations=1,
        validation_method='backwards',
        no_negatives=True,
        remove_leading_zeroes=True
    )
    
    model = model.fit(
        train_df,
        date_col='year-quarter-date',
        value_col='value',
        id_col='vial_size'
    )
    
    # Get the top 3 models based on the score
    top_3_models = model.results().sort_values(by='Score').head(3)
    
    # Re-run each of the top 3 models on the training data for the 2 validation quarters
    validation_forecasts = {}
    for _, row in top_3_models.iterrows():
        # model_forecast expects wide-format data (DatetimeIndex, one column per series),
        # so pass the wide frame built during fit() rather than the long-format train_df
        forecast_result = model_forecast(
            model_name=row['Model'],
            model_param_dict=row['ModelParameters'],
            model_transform_dict=row['TransformationParameters'],
            df_train=model.df_wide_numeric,
            forecast_length=2,
            frequency='Q',
            no_negatives=True,
            remove_leading_zeroes=True,
        )
        # One column per vial_size; compare with the held-out quarters to get per-column MAE
        validation_forecasts[row['ID']] = forecast_result.forecast
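    # Refit the same top 3 models on the full data and forecast the next 4 quarters.
    # molecule_df is in long format, so pivot it to wide format for model_forecast.
    full_wide = molecule_df.pivot(index='year-quarter-date', columns='vial_size', values='value')
    full_wide.index = pd.to_datetime(full_wide.index)
    final_forecasts = {}
    for _, row in top_3_models.iterrows():
        prediction = model_forecast(
            model_name=row['Model'],
            model_param_dict=row['ModelParameters'],
            model_transform_dict=row['TransformationParameters'],
            df_train=full_wide,
            forecast_length=4,
            frequency='Q',
            no_negatives=True,
        )
        final_forecasts[row['ID']] = prediction.forecast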

This is taking a long time to work through, and I would like to know whether there is a better way to achieve what I want. Process flow:

  1. Split the data into train and validation sets (validation = the last 2 quarters)
  2. Fit an AutoTS model (forecast_length=2) to get MAE scores, including for the individual columns
  3. Pick the top 3 models from this run
  4. Forecast the next 4 quarters using these top 3 models
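The four steps above can be sketched end to end without AutoTS. This is only an illustration of the flow: the four forecasters (`last_value`, `historical_mean`, `drift`, `seasonal_naive`), the helper `forecast_frame`, and the data are hypothetical stand-ins for whatever candidate models the search produces.

```python
import numpy as np
import pandas as pd

# Stand-in forecasters (hypothetical, not AutoTS models). Each takes a wide
# training frame and a step count and returns a (steps, n_series) array.
def last_value(train, steps):
    return np.tile(train.iloc[-1].to_numpy(), (steps, 1))

def historical_mean(train, steps):
    return np.tile(train.mean().to_numpy(), (steps, 1))

def drift(train, steps):
    # Extend the average per-quarter change observed over the training window.
    slope = (train.iloc[-1] - train.iloc[0]).to_numpy() / (len(train) - 1)
    return train.iloc[-1].to_numpy() + np.outer(np.arange(1, steps + 1), slope)

def seasonal_naive(train, steps):
    # Repeat the values from the same quarter one year earlier.
    return train.iloc[-4:].to_numpy()[np.arange(steps) % 4]

CANDIDATES = {
    "last_value": last_value,
    "historical_mean": historical_mean,
    "drift": drift,
    "seasonal_naive": seasonal_naive,
}

def forecast_frame(fn, train, steps):
    """Run a forecaster and label its output with the next quarterly dates."""
    future = pd.date_range(train.index[-1], periods=steps + 1, freq="QS")[1:]
    return pd.DataFrame(fn(train, steps), index=future, columns=train.columns)

# Hypothetical wide-format data: 8 quarters, one column per vial size.
wide = pd.DataFrame(
    {
        "5ml": [10, 12, 11, 13, 14, 15, 16, 18],
        "10ml": [20, 19, 21, 22, 24, 23, 25, 26],
    },
    index=pd.date_range("2022-01-01", periods=8, freq="QS"),
    dtype=float,
)

# Step 1: hold out the last 2 quarters.
train, valid = wide.iloc[:-2], wide.iloc[-2:]

# Step 2: validation MAE per model (rows) and per series (columns).
per_series_mae = pd.DataFrame(
    {
        name: (valid - forecast_frame(fn, train, len(valid))).abs().mean()
        for name, fn in CANDIDATES.items()
    }
).T

# Step 3: rank by MAE averaged across series and keep the best 3.
top3 = per_series_mae.mean(axis=1).nsmallest(3).index.tolist()

# Step 4: refit only those 3 on the full data and forecast 4 quarters ahead.
final_forecasts = {name: forecast_frame(CANDIDATES[name], wide, 4) for name in top3}

print(per_series_mae)
print(top3)
```

The point of the design is that the expensive comparison happens once, on the train/validation split. The final step reuses the selected model definitions on the full data instead of repeating the search.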

Upvotes: 0

Views: 92

Answers (0)
