Alex

Reputation: 159

Simple tests for seasonality in Python

I've been trying to find a way to iterate over multiple time series (one per "customer" key) and return, for each series, a simple yes/no answer on whether it has a solid seasonal component.

I know there's a relatively new library in R for this (seastests) - is there something in Python I can use without plotting each time series?

Upvotes: 2

Views: 2147

Answers (1)

Irene

Reputation: 111

I know this is old but I've just faced a similar problem.

I'd suggest fitting SARIMAX models to each customer's series and using the seasonal order P, D, Q of the best model (P: seasonal autoregressive order, D: seasonal differencing order, Q: seasonal moving-average order) to decide whether the series is seasonal. If the model with the minimum AIC has P = D = Q = 0, we can assume the series is not seasonal (y_ds holds the series values):

import itertools

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Customize as needed - in my case `data` already has ds (date), customer_id and y (series values)
data = pd.DataFrame({'ds': pd.to_datetime(data.ds), 'customer_id': data['customer_id'], 'y': data['y']})
data.set_index('ds', inplace=True)
customers = data['customer_id'].unique()

# Candidate (p, d, q) and seasonal (P, D, Q, 12) orders; every term is 0 or 1
p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in pdq]

rows = []  # one result row per customer
for customer in customers:
    # .copy() avoids a SettingWithCopyWarning when adding log_y
    datai = data.loc[data['customer_id'] == customer].copy()
    datai['log_y'] = np.log(datai['y'])  # requires y > 0
    y_ds = datai['log_y'].resample('MS').mean()
    y_ds = y_ds.fillna(0)  # months without data become 0 on the log scale

    # Fit every order combination and record its AIC
    fits = []
    for param in pdq:
        for param_seasonal in seasonal_pdq:
            try:
                mod = sm.tsa.statespace.SARIMAX(y_ds,
                                                order=param,
                                                seasonal_order=param_seasonal,
                                                enforce_stationarity=False,
                                                enforce_invertibility=False)
                results = mod.fit(disp=False)
                fits.append({'param': param, 'seasonal_param': param_seasonal, 'AIC': results.aic})
            except Exception:
                continue  # skip combinations that fail to fit

    # DataFrame.append was removed in pandas 2.0, so build the frame from a list
    lst = pd.DataFrame(fits, columns=['param', 'seasonal_param', 'AIC'])
    best = lst.loc[lst['AIC'].idxmin()]
    order_param = best['param']
    seasonal_order_param = best['seasonal_param']

    if seasonal_order_param == (0, 0, 0, 12):
        pattern = 'NON SEASONAL'
    else:
        pattern = 'SEASONAL'

    rows.append({'Trend': pattern, 'customer_id': customer})

df_final = pd.DataFrame(rows, columns=['Trend', 'customer_id'])
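
To get a feel for what this grid search costs and how the yes/no rule behaves, here is a minimal sketch using only the standard library. It enumerates the same candidate orders and applies the same decision rule. `classify` is a hypothetical helper name used for illustration and is not part of statsmodels.

```python
import itertools

# Each of p, d, q (and P, D, Q) is tried at 0 and 1, as in the loop above.
values = range(0, 2)
pdq = list(itertools.product(values, repeat=3))
seasonal_pdq = [(P, D, Q, 12) for P, D, Q in pdq]  # 12 = monthly period

# Every non-seasonal order is paired with every seasonal order.
n_fits = len(pdq) * len(seasonal_pdq)
print(n_fits)  # 64 SARIMAX fits per customer


def classify(seasonal_order):
    """Hypothetical helper: label a series from the winning seasonal order."""
    P, D, Q, _period = seasonal_order
    return 'NON SEASONAL' if (P, D, Q) == (0, 0, 0) else 'SEASONAL'


print(classify((0, 0, 0, 12)))  # NON SEASONAL
print(classify((1, 0, 1, 12)))  # SEASONAL
```

Only one of the eight seasonal candidates, (0, 0, 0, 12), leads to 'NON SEASONAL', and each customer needs up to 64 model fits, so with many customers it may be worth shrinking the grid.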

Upvotes: 1
