Reputation: 3355
I want to use the sum of the first five FFT coefficients as a feature for a classifier (in Python). I tried a few resources but I can't get a grasp of this concept. For example, I have an array of 10 elements:
a = [1, 2, 3, 4, 1, 1, 1, 1, 1, 1]  # let's say it represents discrete x-axis readings from an accelerometer
If I apply fft in Python to this array, I get the following output:
array([ 16.00000000+0.j , 0.50000000-5.34306783j,
-3.73606798-0.36327126j, 0.50000000+1.98786975j,
0.73606798-1.53884177j, -2.00000000+0.j ,
0.73606798+1.53884177j, 0.50000000-1.98786975j,
-3.73606798+0.36327126j, 0.50000000+5.34306783j])
If I apply rfft (real FFT) in Python to this array, I get the following output:
array([ 16. , 0.5 , -5.34306783, -3.73606798,
-0.36327126, 0.5 , 1.98786975, 0.73606798,
-1.53884177, -2. ])
How can I calculate the sum of the first five coefficients from these two outputs?
In case of rfft: Should it be just the sum of absolute values of the first five values?
Upvotes: 1
Views: 1275
Reputation: 14577
- Can someone explain the difference between these two outputs? Shouldn't rfft just display the real part of the fft?
rfft efficiently computes the FFT of a real-valued input sequence, whereas fft computes the FFT of a possibly complex-valued input sequence. If the input sequence happens to be purely real, fft will return an equivalent output, within some numerical accuracy and packaging considerations. More specifically for the packaging, rfft avoids returning the upper half of the spectrum, which is the complex-conjugate mirror of the lower half when the input is real-valued. It also avoids returning the imaginary part of the DC (0 Hz) bin and of the Nyquist frequency (half the sampling rate) bin, since those are always zero when dealing with real-valued inputs. Note that the packed, all-real layout shown in your question is the one produced by scipy.fftpack.rfft; numpy.fft.rfft returns the same half spectrum as complex numbers.
So, the output from fft.fft of your example can be mapped to the following outputs of fft.rfft:
16.00000000+0.j -> rfft[0]
0.50000000-5.34306783j -> rfft[1], rfft[2]
-3.73606798-0.36327126j -> rfft[3], rfft[4]
0.50000000+1.98786975j -> rfft[5], rfft[6]
0.73606798-1.53884177j -> rfft[7], rfft[8]
-2.00000000+0.j -> rfft[9]
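The mapping above can be checked numerically. The sketch below assumes NumPy is available. Since `np.fft.rfft` returns complex values, `pack_half_spectrum` is a hypothetical helper, written only for illustration, that flattens them into the packed real layout shown in the question. It assumes an even-length input.

```python
import numpy as np

a = [1, 2, 3, 4, 1, 1, 1, 1, 1, 1]

full = np.fft.fft(a)    # 10 complex coefficients
half = np.fft.rfft(a)   # 6 complex coefficients: bins 0 (DC) through 5 (Nyquist)

def pack_half_spectrum(half_spectrum):
    """Hypothetical helper: flatten the half spectrum of an even-length real
    signal into [Re0, Re1, Im1, Re2, Im2, ..., Re(N/2)]."""
    inner = half_spectrum[1:-1]
    # Interleave real and imaginary parts of the bins between DC and Nyquist.
    interleaved = np.column_stack((inner.real, inner.imag)).ravel()
    return np.concatenate(
        ([half_spectrum[0].real], interleaved, [half_spectrum[-1].real])
    )

packed = pack_half_spectrum(half)

# The lower half of the full FFT equals the half spectrum.
print(np.allclose(full[:6], half))
# The upper half is the complex conjugate of the lower half, mirrored.
print(np.allclose(full[6:], np.conj(full[4:0:-1])))
print(np.round(packed, 8))
```

The last line should print the same ten values as the rfft output in the question, up to rounding.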
- How can I calculate the sum of first five coefficients from these two outputs? In case of rfft: should it be just the sum of absolute values of the first five values?
As observed from the different packaging of the outputs, the first 5 complex-valued coefficients of fft.fft correspond to the first 9 floating-point values returned by fft.rfft. To compute the sum, you will have to sum the real parts and the imaginary parts separately. So, for the sum of the first five coefficients this would give you something like:
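from scipy.fftpack import rfft  # packed real layout; np.fft.rfft returns complex values instead
A = rfft(a)
sum_re = A[0] + A[1] + A[3] + A[5] + A[7]  # real parts of coefficients 0 to 4
sum_im = A[2] + A[4] + A[6] + A[8]         # imaginary parts of coefficients 1 to 4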
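If NumPy's `np.fft.rfft` is used instead, the coefficients come back as complex numbers, so the first five can be sliced directly. The sketch below is one possible way to turn them into a feature. Whether to use the complex sum or the sum of magnitudes is a modelling choice the question leaves open, and the variable names are illustrative only.

```python
import numpy as np

a = [1, 2, 3, 4, 1, 1, 1, 1, 1, 1]

# First five complex coefficients (bins 0 through 4).
coeffs = np.fft.rfft(a)[:5]

# Summing complex numbers adds real and imaginary parts separately.
complex_sum = coeffs.sum()
sum_re, sum_im = complex_sum.real, complex_sum.imag

# Alternative (an assumption about what the feature should be):
# the sum of the magnitudes |X[k]| of the same five coefficients.
magnitude_sum = np.abs(coeffs).sum()

print(sum_re, sum_im, magnitude_sum)
```

Note that summing the absolute values of the first five packed real values is not the same as `magnitude_sum`, because each magnitude combines a real/imaginary pair as sqrt(re^2 + im^2).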
Upvotes: 4