Reputation: 159
I've been trying to find a way to iterate over multiple time series (one per "customer" key) and return, for each series, a simple yes/no for whether it has a strong seasonal component.
I know there's a relatively new R library for this (seastests). Is there something in Python I can use without plotting each time series?
Upvotes: 2
Views: 2147
Reputation: 111
I know this is old but I've just faced a similar problem.
I'd suggest fitting a SARIMAX model to each customer's series and using the seasonal order parameters P, D, Q (P: seasonal autoregressive order, D: seasonal differencing order, Q: seasonal moving-average order) to decide whether the series is seasonal. If the model with the minimum AIC has all of those parameters equal to 0, we can assume the series is not seasonal (y_ds holds the series values):
import itertools

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Customize the data to your problem - in my case I have ds as date, customer_id and y as series values
data = pd.DataFrame({'ds': pd.to_datetime(data.ds), 'customer_id': data['customer_id'], 'y': data['y']})
data.set_index('ds', inplace=True)

# Candidate (p, d, q) and seasonal (P, D, Q, 12) orders; 12 = yearly cycle on monthly data
p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in pdq]
cols = ['param', 'seasonal_param', 'AIC']

rows = []  # one result per customer
for customer in data['customer_id'].unique():
    # .copy() avoids SettingWithCopyWarning when adding the log column
    datai = data.loc[data['customer_id'] == customer].copy()
    datai['log_y'] = np.log(datai['y'])  # requires y > 0
    y_ds = datai['log_y'].resample('MS').mean()
    y_ds = y_ds.fillna(0)  # months without data become 0 on the log scale

    # Grid search: record the AIC of every combination that fits
    fits = []
    for param in pdq:
        for param_seasonal in seasonal_pdq:
            try:
                mod = sm.tsa.statespace.SARIMAX(y_ds,
                                                order=param,
                                                seasonal_order=param_seasonal,
                                                enforce_stationarity=False,
                                                enforce_invertibility=False)
                results = mod.fit(disp=False)
                fits.append({'param': param, 'seasonal_param': param_seasonal, 'AIC': results.aic})
            except Exception:
                continue
    # DataFrame.append was removed in pandas 2.0, so build the frame from a list
    lst = pd.DataFrame(fits, columns=cols)

    # Orders of the model with the lowest AIC
    best = lst.loc[lst['AIC'].idxmin()]
    order_param = best['param']
    seasonal_order_param = best['seasonal_param']
    if seasonal_order_param == (0, 0, 0, 12):
        pattern = 'NON SEASONAL'
    else:
        pattern = 'SEASONAL'
    rows.append({'Trend': pattern, 'customer_id': customer})

df_final = pd.DataFrame(rows)
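To make the decision rule easier to check without fitting anything, here is a minimal, standard-library-only sketch of the two pieces the loop relies on: enumerating the candidate orders and classifying a series from the minimum-AIC fit. The helper names `candidate_orders` and `classify_by_min_aic` are hypothetical (they are not part of statsmodels), and the AIC values in `example_fits` are made up purely for illustration.

```python
import itertools
import math


def candidate_orders(max_order=1, period=12):
    """Return every (order, seasonal_order) pair the grid search would try."""
    r = range(max_order + 1)
    pdq = list(itertools.product(r, r, r))
    seasonal = [(P, D, Q, period) for P, D, Q in pdq]
    return list(itertools.product(pdq, seasonal))


def classify_by_min_aic(fits):
    """Classify a series from (order, seasonal_order, aic) tuples.

    Returns 'SEASONAL', 'NON SEASONAL', or None when no fit has a usable AIC.
    """
    usable = [f for f in fits if f[2] is not None and not math.isnan(f[2])]
    if not usable:
        return None
    _, seasonal_order, _ = min(usable, key=lambda f: f[2])
    # Non-seasonal means P = D = Q = 0, whatever the period is
    if tuple(seasonal_order[:3]) == (0, 0, 0):
        return 'NON SEASONAL'
    return 'SEASONAL'


# Made-up AIC values, for illustration only (not real results)
example_fits = [
    ((1, 0, 0), (0, 0, 0, 12), 210.4),
    ((1, 0, 0), (1, 0, 0, 12), 198.7),
    ((0, 1, 1), (0, 1, 1, 12), float('nan')),
]

print(len(candidate_orders()))            # 64 models per customer
print(classify_by_min_aic(example_fits))  # SEASONAL
```

With orders in range(0, 2) that is 8 x 8 = 64 fits per customer, so the grid gets slow with many customers. Returning None when nothing could be fitted also lets you tell "could not decide" apart from "not seasonal".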
Upvotes: 1