Reputation: 843
Assume we have time-series data containing the daily order counts for the last two years:
We can forecast future orders using Python's statsmodels library:
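import statsmodels.api as sm

# Fit a seasonal ARIMA model with a weekly (7-day) seasonal period
fit1 = sm.tsa.statespace.SARIMAX(
    train.Count, order=(2, 1, 4), seasonal_order=(0, 1, 1, 7)
).fit()
# Forecast dynamically over the hold-out date range
y_hat_avg['SARIMA'] = fit1.predict(
    start="2018-06-16", end="2018-08-14", dynamic=True
)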
Result (don't mind the numbers):
Now assume that our input data contains some unusual increases or decreases caused by holidays or company promotions. So we added two columns that indicate whether each day was a "holiday" and whether the company ran a "promotion" that day.
Is there a method (and a way of implementing it in Python) to use this new type of input data so that the model understands the reason for the outliers, and to predict future orders when given "holiday" and "promotion_day" information?
fit1.predict('2018-08-29', holiday=True, is_promotion=False)
# or
fit1.predict(start="2018-08-20", end="2018-08-25", holiday=[0,0,0,1,1,0], is_promotion=[0,0,1,1,0,1])
Upvotes: 6
Views: 7720
Reputation: 6554
SARIMAX, as a generalisation of the SARIMA model, is designed to handle exactly this. From the docs,
Parameters:
- endog (array_like) – The observed time-series process y;
- exog (array_like, optional) – Array of exogenous regressors, shaped (nobs, k).
You could pass holiday and promotion_day as an array of shape (nobs, 2) to exog, which will inform the model of the exogenous nature of some of these observations. When forecasting, you also need to supply the exog values for the future dates.
Upvotes: 7
Reputation: 171
Try this (it may or may not work based on your problem/data):
You can split your date into multiple features such as day of week, day of month, month of year, year, whether it is the last day of the month, whether it is the first day of the month, and any others you can think of. Then train a general-purpose ML algorithm on them, such as Random Forests, Gradient Boosting Trees or Neural Networks (especially with embedding layers for categorical features such as day of week).
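A minimal sketch of that idea, assuming scikit-learn is available. The helper calendar_features, the synthetic data and the column names holiday and is_promotion are all made up for illustration, and a random forest is just one of the algorithms mentioned above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def calendar_features(index: pd.DatetimeIndex) -> pd.DataFrame:
    """Hypothetical helper: expand dates into simple calendar features."""
    return pd.DataFrame(
        {
            "day_of_week": index.dayofweek,
            "day_of_month": index.day,
            "month": index.month,
            "year": index.year,
            "is_month_start": index.is_month_start.astype(int),
            "is_month_end": index.is_month_end.astype(int),
        },
        index=index,
    )


# Synthetic stand-in for the real data (assumed column names).
rng = np.random.default_rng(0)
idx = pd.date_range("2017-01-01", periods=500, freq="D")
X = calendar_features(idx)
X["holiday"] = (rng.random(len(idx)) < 0.05).astype(int)
X["is_promotion"] = (rng.random(len(idx)) < 0.10).astype(int)
y = (
    100
    + 5 * X["day_of_week"]
    - 30 * X["holiday"]
    + 40 * X["is_promotion"]
    + rng.normal(0, 3, len(idx))
)

# Train on everything except the last 6 days, then predict those days.
X_train, X_future = X.iloc[:-6], X.iloc[-6:]
y_train = y.iloc[:-6]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
preds = model.predict(X_future)
print(pd.Series(preds, index=X_future.index))
```

Note that tree-based models cannot extrapolate a trend beyond the range seen in training, so this tends to suit data without strong growth, or data that has been detrended first.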
Upvotes: 0
Reputation: 703
Although it's not from statsmodels, you can use Facebook's Prophet library for time series forecasting, where you can pass dates with recurring events to your model.
See here.
Upvotes: 0
Reputation: 570
This problem goes by different names, such as anomaly detection, rare event detection and extreme event detection.
There are some posts on the Uber engineering blog that may be useful for understanding the problem and the solution. Please look here and here.
Upvotes: 0