Reputation: 1689
I am trying to find the frequency of an array of numbers taken from a wav file using the Fast Fourier Transform and numpy; however, the output shows the wrong frequency.
Here is my code:
from pydub import AudioSegment
import numpy as np
np.set_printoptions(threshold=np.inf)
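# The file is a WAV, so load it with from_wav rather than from_mp3
sound = AudioSegment.from_wav("500Hz.wav")
raw_data = sound.raw_data
# np.fromstring is deprecated for binary data; np.frombuffer is the replacement
raw_data = np.frombuffer(raw_data, dtype=np.int16)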
print(raw_data[:2000:21])
wave = raw_data
fft = np.fft.rfft(wave)
fft = np.abs(fft)
print(max(list(fft)))
print(list(fft).index(max(list(fft))))
fft = np.array([int(i) for i in fft])
The 500Hz.wav file is a 500Hz audio wave for 3 seconds created using Audacity.
The code returns the following:
[ 0 26138 3906 -25559 -7727 24402 11370 -22702 -14767 20496
17830 -17830 -20498 14763 22701 -11374 -24400 7728 25557 -3907
-26140 0 26141 3905 -25555 -7728 24404 11373 -22704 -14767
20496 17831 -17829 -20493 14765 22698 -11375 -24404 7725 25553
-3906 -26138 -1 26141 3907 -25559 -7726 24402 11375 -22702
-14765 20497 17830 -17831 -20498 14762 22700 -11374 -24401 7726
25557 -3906 -26141 2 26139 3912 -25556 -7728 24401 11376
-22702 -14767 20499 17830 -17830 -20496 14766 22704 -11372 -24405
7725 25559 -3906 -26141 -1 26139 3906 -25556 -7725 24404
11373 -22702 -14769 20495 17831 -17832]
2046217405.9084692
1770
This shows that the peak is at 1770Hz and not at 500Hz and I am unsure what is causing this. If I am missing any information, please let me know so I can add it to the question!
Edit: The file is available at https://ufile.io/nk7j9
Upvotes: 1
Views: 1018
Reputation: 9817
The frequency corresponding to index 1770 depends on the duration of the frame. For instance, if the frame lasts 3 seconds, the frequency of index i is i/3 Hz. The zero frequency corresponds to the average, or DC component, of the signal; here it is zero. If index 1770 corresponds to 500 Hz, the duration of the frame is likely about 3.54 seconds (1770/500). Since the file is read using pydub, this duration can be retrieved in milliseconds using len(sound).
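As a rough illustration of the index-to-frequency conversion, here is a minimal sketch. It does not read the actual file: it uses a synthetic 500 Hz sine whose length is chosen (an assumption) so that the peak lands on index 1770, and `peak_frequency` is a hypothetical helper name. With pydub, the sample rate would come from `sound.frame_rate` instead of being hard-coded.

```python
import numpy as np

def peak_frequency(samples, sample_rate):
    """Return the frequency in Hz of the largest-magnitude rfft bin."""
    spectrum = np.abs(np.fft.rfft(samples))
    # rfftfreq maps bin index k to k * sample_rate / len(samples),
    # i.e. k divided by the frame duration in seconds
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

# Synthetic stand-in for the recording (assumed values, not read from the file):
# 156114 samples at 44100 Hz is 3.54 s of a 500 Hz sine
sample_rate = 44100
n = 156114
t = np.arange(n) / sample_rate
samples = np.sin(2 * np.pi * 500 * t)

peak_index = int(np.argmax(np.abs(np.fft.rfft(samples))))
peak_hz = peak_frequency(samples, sample_rate)
print(peak_index, peak_hz)
```

The raw index only equals the frequency in Hz when the frame lasts exactly one second; otherwise it has to be divided by the duration, which is what `np.fft.rfftfreq` does.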
Even if the signal is a pure sine wave, the peak of the discrete Fourier transform (DFT) may spread over several frequency bins. This happens whenever the length of the frame is not a multiple of the period of the sine wave, and it is called spectral leakage. It can be mitigated by applying a window before the DFT.
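The effect of a window on leakage can be sketched as follows. This is only an illustration on a synthetic signal: the sample rate, the 500.5 Hz test tone, the choice of a Hann window, and the helper `relative_leakage` are all assumptions made for the example, not something taken from the question.

```python
import numpy as np

sample_rate = 8000
n = 8000  # 1 s frame, so the bins are spaced 1 Hz apart
t = np.arange(n) / sample_rate
# 500.5 Hz falls halfway between bins 500 and 501, which maximizes leakage
samples = np.sin(2 * np.pi * 500.5 * t)

def relative_leakage(frame, offset=50):
    """Magnitude `offset` bins above the peak, relative to the peak itself."""
    spectrum = np.abs(np.fft.rfft(frame))
    peak = int(np.argmax(spectrum))
    return spectrum[peak + offset] / spectrum[peak]

leak_rect = relative_leakage(samples)                  # no window
leak_hann = relative_leakage(samples * np.hanning(n))  # Hann window
print(leak_rect, leak_hann)
```

The windowed frame should show far less energy away from the peak, at the cost of a slightly wider main lobe.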
Finally, the frequency of the input sine wave likely falls between the discrete frequencies of the DFT. Hence, estimating the frequency of the sine wave from the index of the highest amplitude can be slightly off. To correct that, the actual frequency of a peak of DFT amplitude can be estimated as its mean frequency with respect to power density. See "Why are frequency values rounded in signal using FFT?"
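One way to read "mean frequency with respect to power density" is a power-weighted average of the bin frequencies around the peak. The sketch below follows that reading; the helper name `mean_peak_frequency`, the Hann window, the 3-bin half-width, and the synthetic 500.3 Hz tone are assumptions for illustration and may differ from what the linked answer does.

```python
import numpy as np

def mean_peak_frequency(samples, sample_rate, half_width=3):
    """Power-weighted mean frequency of the bins around the largest peak."""
    # A Hann window keeps the peak's energy concentrated in a few bins
    windowed = samples * np.hanning(len(samples))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    peak = int(np.argmax(power))
    lo = max(peak - half_width, 0)
    hi = min(peak + half_width + 1, len(power))
    return np.sum(freqs[lo:hi] * power[lo:hi]) / np.sum(power[lo:hi])

# Synthetic tone that does not sit on a bin (bins are 1 Hz apart here)
sample_rate = 8000
n = 8000
t = np.arange(n) / sample_rate
samples = np.sin(2 * np.pi * 500.3 * t)

freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
naive_estimate = freqs[np.argmax(np.abs(np.fft.rfft(samples)))]
refined_estimate = mean_peak_frequency(samples, sample_rate)
print(naive_estimate, refined_estimate)
```

The naive estimate can only return a bin frequency, while the weighted mean can land between bins and so should sit closer to the true tone.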
Upvotes: 1