Reputation: 345
Assume I have a time series t with one hundred measurements, each entry representing the measured value for each day. I assume there is some periodicity in the signal -- it might repeat daily, weekly or monthly.
Translating the time series into the Fourier domain might help to find such a periodicity?
How could I use numpy's fft module to find the likeliest period for my time series?
Upvotes: 12
Views: 12791
Reputation: 359
There is another method that does not rely on Fourier Series. This method helps you identify if the signal is periodic. Ideally, the time series, in this case, should be binary:
[0,1,0,0,0,1,1,0,0,0,1,0,0,1]
You first compute the distribution of the distances between the locations of consecutive peaks. Then, you compute the factor
-1<B<1:
B = (var - mean)/(var + mean)
With mean
and varrespectively the mean and the variance of the computed distribution. The closer
Bto
-1the more periodic the signal is. If
B` close to 0, then there is no periodicity in the signal and the peaks are located randomly in the time series.
For further information, look up for the keyword Burstiness.
Upvotes: 1
Reputation: 70703
Note than an FFT finds a sinusoidal decomposition, which is different from a periodicity estimator (as any fundamental period can be completely missing from a periodic signal's frequency spectrum. See missing fundamental )
So you will need to post-process your FFT result with something like a cepstrum (using complex cepstral analysis, etc.), or perhaps use a Harmonic Product Spectrum estimator.
Upvotes: 1
Reputation: 345
I will aim to answer my own question. You may correct me where appropriate.
Asume our time series t is t = [2,1,0,1,2,3,2,1,0,1,2,3,2,1,0,1,2,3]
with 18 measurements. A rather simple example: It seems likely that the length of the period is six time units.
Taking this time series into the Frequency Domain yields us:
w = numpy.fft.fft(t)
print numpy.absolute(w)
array([27.000000, 0.000000, 0.000000, 12.000000, 0.000000, 0.000000,
0.000000, 0.000000, 0.000000, 3.000000, 0.000000, 0.000000,
0.000000, 0.000000, 0.000000, 12.000000, 0.000000, 0.000000])
We ignore frequency 0 and observe that the value is largest for index 3 -- this indicates that within our time series t the signal repeats 3 times. Hence the length of the signal -- the period -- would be 18/3 = 6. And indeed:
numpy.fft.fftfreq(18)
array([ 0. , 0.055556, 0.111111, 0.166667, 0.222222, 0.277778,
0.333333, 0.388889, 0.444444, -0.5 , -0.444444, -0.388889,
-0.333333, -0.277778, -0.222222, -0.166667, -0.111111, -0.055556])
The frequency at index 3 is exactly 1/6, i.e. the frequency for one time unit is 1/6, meaning the signal takes six time units for a full period.
Please let me know if my understanding is correct.
Upvotes: 6