isthisreal

Reputation: 21

How can I get better forecasting results with Prophet cross validation?

I have 10 years of daily demand data with a positive trend. https://gofile.io/?c=PS3YCO

In the last three months of every year there is always a demand shock on the 1st-2nd and the 15th-16th of the month (promotions).

I tried forecasting it with:

future = m.make_future_dataframe(periods=365)

forecast = m.predict(future)

But the result was not what I expected. The best MSE I could get was 6681, and when I try cross validation the result is almost the same at 6690.

Also: when I use "from fbprophet.diagnostics import performance_metrics" to calculate the MSE, it doesn't give me the values for the test data but for a longer period. How can I calculate the MSE for just the last year?

Thank you so much for your help :)

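import numpy as np
import pandas as pd  # needed for pd.read_excel, pd.DataFrame and pd.to_datetime below
from fbprophet import Prophet
import matplotlib.pyplot as plt
from fbprophet.diagnostics import cross_validation

# Prophet expects a 'ds' (date) column and a 'y' (value) column
df = pd.read_excel('Dataset2.3_kurz.xls')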

promotions = pd.DataFrame({
    'holiday': 'winter_promotion',
    'ds': pd.to_datetime(['2009-10-1','2009-10-2','2009-10-15','2009-10-16',
                          '2009-11-1','2009-11-2','2009-11-15','2009-11-16',
                          '2009-12-1','2009-12-2','2009-12-15','2009-12-16',
                          '2010-10-1','2010-10-2','2010-10-15','2010-10-16',
                          '2010-11-1','2010-11-2','2010-11-15','2010-11-16',
                          '2010-12-1','2010-12-2','2010-12-15','2010-12-16',
                          '2011-10-1','2011-10-2','2011-10-15','2011-10-16',
                          '2011-11-1','2011-11-2','2011-11-15','2011-11-16',
                          '2011-12-1','2011-12-2','2011-12-15','2011-12-16',
                          '2012-10-1','2012-10-2','2012-10-15','2012-10-16',
                          '2012-11-1','2012-11-2','2012-11-15','2012-11-16',
                          '2012-12-1','2012-12-2','2012-12-15','2012-12-16',
                          '2013-10-1','2013-10-2','2013-10-15','2013-10-16',
                          '2013-11-1','2013-11-2','2013-11-15','2013-11-16',
                          '2013-12-1','2013-12-2','2013-12-15','2013-12-16',
                          '2014-10-1','2014-10-2','2014-10-15','2014-10-16',
                          '2014-11-1','2014-11-2','2014-11-15','2014-11-16',
                          '2014-12-1','2014-12-2','2014-12-15','2014-12-16',
                          '2015-10-1','2015-10-2','2015-10-15','2015-10-16',
                          '2015-11-1','2015-11-2','2015-11-15','2015-11-16',
                          '2015-12-1','2015-12-2','2015-12-15','2015-12-16',
                          '2016-10-1','2016-10-2','2016-10-15','2016-10-16',
                          '2016-11-1','2016-11-2','2016-11-15','2016-11-16',
                          '2016-12-1','2016-12-2','2016-12-15','2016-12-16',
                          '2017-10-1','2017-10-2','2017-10-15','2017-10-16',
                          '2017-11-1','2017-11-2','2017-11-15','2017-11-16',
                          '2017-12-1','2017-12-2','2017-12-15','2017-12-16',
                          '2018-10-1','2018-10-2','2018-10-15','2018-10-16',
                          '2018-11-1','2018-11-2','2018-11-15','2018-11-16',
                          '2018-12-1','2018-12-2','2018-12-15','2018-12-16',
                          '2019-10-1','2019-10-2','2019-10-15','2019-10-16',
                          '2019-11-1','2019-11-2','2019-11-15','2019-11-16',
                          '2019-12-1','2019-12-2','2019-12-15','2019-12-16']),
    'lower_window': 0, 
    'upper_window': 0, 
})

# model
m = Prophet(
    growth='linear',
    holidays=promotions,
    seasonality_mode='multiplicative',
    holidays_prior_scale=10,
    seasonality_prior_scale=10,
    yearly_seasonality=True,
)

m.fit(df)  

df_cv = cross_validation(m, initial='732 days', period='365 days', horizon = '366 days')

from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p = df_p[-365:]
df_p.tail()
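
On the last-year MSE: performance_metrics aggregates the errors over all cutoffs and reports them by horizon, so slicing its last 365 rows does not isolate the final year. One possible way around that is to filter df_cv itself to the most recent cutoff and compute the MSE by hand. This is only a minimal sketch; mse_last_fold and the demo frame are made up for illustration, and it assumes df_cv has the ds, y, yhat and cutoff columns that cross_validation returns.

```python
import pandas as pd


def mse_last_fold(df_cv):
    """MSE over the forecasts made from the most recent cutoff only."""
    last_cutoff = df_cv["cutoff"].max()
    last = df_cv[df_cv["cutoff"] == last_cutoff]
    return float(((last["y"] - last["yhat"]) ** 2).mean())


# Tiny made-up frame with the same columns as the cross_validation output.
demo = pd.DataFrame({
    "ds": pd.to_datetime(["2018-01-01", "2018-01-02", "2019-01-01", "2019-01-02"]),
    "cutoff": pd.to_datetime(["2017-12-31", "2017-12-31", "2018-12-31", "2018-12-31"]),
    "y": [10.0, 12.0, 20.0, 22.0],
    "yhat": [11.0, 15.0, 18.0, 22.0],
})

# Only the two rows forecast from the 2018-12-31 cutoff are used.
print(mse_last_fold(demo))
```

With the settings above (period='365 days', horizon='366 days'), the last cutoff should sit about one horizon before the end of the data, so mse_last_fold(df_cv) would cover roughly the final year.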

Upvotes: 1

Views: 3815

Answers (1)

Anirudh Lakhotia

Reputation: 39

Try a grid search to tune the hyperparameters. Adjusting the changepoint_prior_scale parameter might help too. This is just a personal opinion, but setting yearly_seasonality to False and adding the yearly seasonality manually with a custom Fourier order and prior scale could also help. About cross_validation, check this link for a better understanding.
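
A rough sketch of what such a grid search could look like, reusing the cross-validation settings from the question. The grid values, the helper names cv_mse and grid_search, and the choice to score by plain MSE over all folds are assumptions made for illustration, not recommended settings.

```python
import itertools

# Hypothetical search grid; the values are illustrative, not recommendations.
param_grid = {
    "changepoint_prior_scale": [0.01, 0.1, 0.5],
    "yearly_fourier_order": [5, 10, 20],
    "yearly_prior_scale": [1.0, 10.0],
}

# Every combination of the values above, as a list of dicts.
all_params = [
    dict(zip(param_grid, values))
    for values in itertools.product(*param_grid.values())
]


def cv_mse(df, holidays, params):
    """Fit one model for a parameter combination and return its CV MSE."""
    # Imported lazily so the grid can be inspected without fbprophet installed
    # (newer releases ship the package under the name `prophet`).
    from fbprophet import Prophet
    from fbprophet.diagnostics import cross_validation

    m = Prophet(
        growth="linear",
        holidays=holidays,
        seasonality_mode="multiplicative",
        yearly_seasonality=False,  # replaced by the custom term below
        changepoint_prior_scale=params["changepoint_prior_scale"],
    )
    m.add_seasonality(
        name="yearly",
        period=365.25,
        fourier_order=params["yearly_fourier_order"],
        prior_scale=params["yearly_prior_scale"],
    )
    m.fit(df)
    df_cv = cross_validation(
        m, initial="732 days", period="365 days", horizon="366 days"
    )
    return float(((df_cv["y"] - df_cv["yhat"]) ** 2).mean())


def grid_search(df, holidays):
    """Return (best_mse, best_params) over the whole grid."""
    scores = [(cv_mse(df, holidays, p), p) for p in all_params]
    return min(scores, key=lambda s: s[0])
```

Calling grid_search(df, promotions) would refit the model 18 times, each with its own cross-validation run, so it can take a while on 10 years of daily data.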

Upvotes: 1
