Reputation: 4492
I have the following sequence:
states_list = ['H', 'M', 'M', 'M', 'H', 'H', 'H', 'H', 'C', 'C', 'H', 'H', 'C', 'C', 'H', 'A', 'A', 'A', 'A', 'A', 'S', 'S', 'S', 'A', 'S', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'C', 'H', 'H', 'H', 'H', 'H', 'S', 'H', 'S', 'S', 'S', 'H', 'H', 'H', 'H', 'H', 'H', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'H', 'H', 'H', 'H', 'H', 'C', 'C', 'C', 'A', 'C', 'C', 'A', 'A', 'A', 'A', 'A', 'H', 'H', 'H', 'H', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C']
Is there a way to find "seasonality" in this time series?
By "seasonality" I mean a specific sub-sequence of letters that recurs every "n" letters.
Upvotes: 0
Views: 225
Reputation: 1842
The standard technique for seasonality detection is a lagged autocorrelation plot. That is, you shift your series by various time lags and check whether the shifted series is correlated with the original (search for "ACF" and "ACF plot").
Now, you have a categorical time series, so the standard tools won't work out of the box. I searched briefly and didn't find anything ready-made, but all the ingredients are there.
The main ingredient is a correlation measure for categorical variables, and that's Cramér's V. SciPy provides it as scipy.stats.contingency.association; see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.contingency.association.html.
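To get a feel for what Cramér's V reports, here is a minimal sketch that calls association on two hand-made contingency tables. The counts are invented purely for illustration: one table where the two variables are unrelated and one where each category of the first variable fully determines the second.

```python
import numpy as np
from scipy.stats.contingency import association

# Rows are categories of one variable, columns are categories of the other.
# Both tables are made-up counts, not derived from the question's data.
independent = np.array([[10, 10],
                        [10, 10]])   # every cell matches its expected count
dependent = np.array([[20, 0],
                      [0, 20]])      # row category fully determines the column

v_independent = association(independent, method="cramer")
v_dependent = association(dependent, method="cramer")

print(f"unrelated variables:  V = {v_independent:.3f}")
print(f"fully dependent ones: V = {v_dependent:.3f}")
```

Values near 0 mean no association and values near 1 mean a strong one. Note that association expects an integer table of counts, not the raw label sequences.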
Then you will need to write some code that, for each k = 1, 2, 3, ..., shifts the series by k, computes Cramér's V between the shifted and unshifted series, and saves the result.
After that, plot the correlations against k and see if anything stands out.
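The steps above could be sketched roughly as follows. This is only one way to do it, not a ready-made library routine: the helper names lagged_cramers_v and plot_lag_profile are made up for this example, the contingency table is built by hand with NumPy, and empty rows and columns are dropped because association fails when an expected frequency is zero.

```python
import numpy as np
from scipy.stats.contingency import association


def lagged_cramers_v(seq, max_lag):
    """Return {k: Cramér's V between seq[t] and seq[t + k]} for k = 1..max_lag.

    Hypothetical helper written for this answer, not part of SciPy.
    """
    seq = np.asarray(seq)
    labels, codes = np.unique(seq, return_inverse=True)  # letters -> integer codes
    n_labels = len(labels)
    profile = {}
    for k in range(1, max_lag + 1):
        a, b = codes[:-k], codes[k:]            # unshifted vs. shifted by k
        table = np.zeros((n_labels, n_labels), dtype=int)
        np.add.at(table, (a, b), 1)             # count co-occurring (a, b) pairs
        # Drop categories that never occur on one side at this lag.
        table = table[table.sum(axis=1) > 0][:, table.sum(axis=0) > 0]
        if min(table.shape) < 2:
            profile[k] = float("nan")           # V is undefined for a 1-row/1-column table
        else:
            profile[k] = float(association(table, method="cramer"))
    return profile


def plot_lag_profile(profile):
    """Stem plot of Cramér's V against the lag k (needs matplotlib)."""
    import matplotlib.pyplot as plt

    lags = sorted(profile)
    plt.stem(lags, [profile[k] for k in lags])
    plt.xlabel("lag k")
    plt.ylabel("Cramér's V")
    plt.show()
```

With the list from the question you would call profile = lagged_cramers_v(states_list, max_lag=30) and then plot_lag_profile(profile). Two caveats: Cramér's V measures any association between the two positions, not just "same letter", so long runs of one letter inflate the values at small lags; and with only 100 observations and 5 categories the estimate is noisy, so keep max_lag well below the series length and treat isolated peaks with caution.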
Upvotes: 1