Reputation: 482
I have been running seasonal_decompose() from statsmodels on about 20 completely different datasets. Is it standard for the seasonality to have a period of 7 when the dataset has daily frequency?
Here is a picture of the decomposition of one dataset as an example. I zoomed in on the seasonal component so that you can see that the period is again 7 days:
Why is it always 7 days though? I wouldn't expect it to be always 7 days and the datasets are all different from each other, so by now I think that either this is total coincidence or this is because of seasonal_decompose().
But as I read the statsmodels documentation, seasonal_decompose() uses LOESS to figure out the seasonality. If I look at the formula, it should be able to find different seasonal frequencies. I just need to verify that I am not wrong here: is it pure coincidence that all of my datasets produce a 7-day seasonal period?
Upvotes: 0
Views: 784
Reputation: 1481
First of all, seasonal_decompose has nothing to do with LOESS; for decomposition based on LOESS you need to use statsmodels.tsa.seasonal.STL. seasonal_decompose does not infer periodicity from the data in any way. You only have two options:
1. Specify the period argument.
2. Leave the period argument at None. In this case you have to feed a pandas DataFrame with a datetime index to seasonal_decompose, and the periodicity will be inferred from the datetime index frequency label; otherwise it will throw an error. It first fetches the frequency label: pfreq = getattr(getattr(x, "index", None), "inferred_freq", None) (in your case the frequency label will be 'D', meaning daily), then it converts it to a periodicity using statsmodels.tsa.tsatools.freq_to_period (in your case the frequency label 'D' will be converted to 7, and that will be used as the periodicity, hence the results you get).
Upvotes: 1