Reputation: 51
I have spent the best part of the last few days searching forums and reading papers trying to solve the following question. I have thousands of time series arrays each of varying lengths containing a single column vector. this column vector contains the time between clicks for dolphins using echolocation.
I have managed to cluster these into similar groups using DTW and want to check which trains have a high degree of similarity i.e repeated patterns. I only want to know the similarity with themselves and don't care to compare them with other trains as I have already applied DTW for that. I'm hoping some of these clusters will contain trains with a high proportion of repeated patterns.
I have already applied the Ljung–Box test to each series to check for autocorrelation but think i should maybe be using something with FFT and the power spectrum. I don't have much experience in this but have tried to do so using a Python package waipy. Ultimately, I just want to know if there is some kind of repeated pattern in the data ideally tested with a p-value. The image I have attached shows an example train across the top. the maximum length of my trains is 550.
I know this is quite a complex question but any help would be greatly appreciated even if it is a link to a helpful Python library.
Thanks,
Dex
Upvotes: 2
Views: 2478
Reputation: 51
For anyone in a similar position I decided to go with Motifs as they are able to find a repeated pattern in a time series using euclidian distance. There is a really good package in Python called Stumpy which makes this very easy!
Thanks,
Dex
Upvotes: 3