Rlz

Reputation: 1689

Python - Fourier transform wrong frequency

I am trying to find the frequency of an array of numbers taken from a wav file using the Fast Fourier Transform and numpy, but the output shows the wrong frequency.

Here is my code:

from pydub import AudioSegment
import numpy as np

np.set_printoptions(threshold=np.inf)

sound = AudioSegment.from_mp3("500Hz.wav")

raw_data = sound.raw_data
# np.fromstring is deprecated for binary data; np.frombuffer reads the same bytes
raw_data = np.frombuffer(raw_data, dtype=np.int16)

print(raw_data[:2000:21])

wave = raw_data
fft = np.fft.rfft(wave)
fft = np.abs(fft)

# Largest magnitude and the index of the bin that holds it
print(fft.max())
print(fft.argmax())

fft = fft.astype(int)

The 500Hz.wav file is a 500Hz audio wave for 3 seconds created using Audacity.

The code returns the following:

[     0  26138   3906 -25559  -7727  24402  11370 -22702 -14767  20496
  17830 -17830 -20498  14763  22701 -11374 -24400   7728  25557  -3907
 -26140      0  26141   3905 -25555  -7728  24404  11373 -22704 -14767
  20496  17831 -17829 -20493  14765  22698 -11375 -24404   7725  25553
  -3906 -26138     -1  26141   3907 -25559  -7726  24402  11375 -22702
 -14765  20497  17830 -17831 -20498  14762  22700 -11374 -24401   7726
  25557  -3906 -26141      2  26139   3912 -25556  -7728  24401  11376
 -22702 -14767  20499  17830 -17830 -20496  14766  22704 -11372 -24405
   7725  25559  -3906 -26141     -1  26139   3906 -25556  -7725  24404
  11373 -22702 -14769  20495  17831 -17832]
2046217405.9084692
1770

This shows that the peak is at 1770Hz and not at 500Hz and I am unsure what is causing this. If I am missing any information, please let me know so I can add it to the question!

Edit: The file is available at https://ufile.io/nk7j9

Upvotes: 1

Views: 1018

Answers (1)

francis

Reputation: 9817

The frequency corresponding to index 1770 depends on the duration of the frame. For instance, if the frame lasts 3 seconds, the frequency of index i is i/3 Hz. The zero frequency corresponds to the average, or DC component, of the signal, which is zero here. If index 1770 corresponds to 500Hz, the duration of the frame is likely about 1770/500 = 3.54 seconds. Since the file is read using pydub, this duration can be retrieved in milliseconds using len(sound).
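The index-to-frequency conversion can be sketched as follows, using a synthetic sine in place of the file. The 44100 Hz sampling rate and the 156114-sample length (3.54 s) are assumptions chosen so that the peak lands on index 1770; they are not read from the actual file. With pydub, the real rate should be available as sound.frame_rate.

```python
import numpy as np

def peak_frequency_hz(samples, sample_rate):
    """Return the frequency in Hz of the rfft bin with the largest magnitude."""
    spectrum = np.abs(np.fft.rfft(samples))
    # rfftfreq maps bin index i to i * sample_rate / len(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

# Assumed values, standing in for the question's file
sample_rate = 44100          # Hz (assumption)
n = 156114                   # samples, i.e. 3.54 s at 44100 Hz (assumption)
t = np.arange(n) / sample_rate
wave = 0.8 * np.sin(2 * np.pi * 500 * t)

peak_index = int(np.argmax(np.abs(np.fft.rfft(wave))))
peak_hz = peak_frequency_hz(wave, sample_rate)
print(peak_index, peak_hz)
```

The bin index alone is not a frequency: it only becomes one after dividing by the frame duration, which is what np.fft.rfftfreq does here.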

Even if the signal is a pure sine wave, the peak of the discrete Fourier transform (DFT) may spread over several frequency bins. This happens whenever the length of the frame is not a multiple of the period of the sine wave, and it is called spectral leakage. It can be reduced by applying a window before the DFT.
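To illustrate the effect of a window, here is a small sketch that compares how much spectral energy ends up far from the peak with and without a Hann window. The signal parameters (8000 Hz sampling, 4096 samples, 501 Hz tone) and the helper energy_far_from_peak with its guard_bins parameter are hypothetical, picked only so that the tone falls between two DFT bins.

```python
import numpy as np

sample_rate = 8000           # Hz (assumption)
n = 4096                     # frame length in samples (assumption)
t = np.arange(n) / sample_rate
# 501 Hz falls between two bins (bin spacing is 8000 / 4096, about 1.95 Hz),
# so the frame does not hold a whole number of periods and leakage occurs
wave = np.sin(2 * np.pi * 501.0 * t)

def energy_far_from_peak(samples, guard_bins=4):
    """Fraction of spectral energy more than guard_bins away from the peak."""
    power = np.abs(np.fft.rfft(samples)) ** 2
    peak = int(np.argmax(power))
    lo, hi = max(peak - guard_bins, 0), peak + guard_bins + 1
    return 1.0 - power[lo:hi].sum() / power.sum()

leak_rect = energy_far_from_peak(wave)                  # no window
leak_hann = energy_far_from_peak(wave * np.hanning(n))  # Hann window
print(leak_rect, leak_hann)
```

The Hann window is only one possible choice; it widens the main peak slightly in exchange for much lower energy in distant bins.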

Finally, the frequency of the input sine wave likely falls between the discrete frequencies of the DFT. Hence, estimating the frequency of the sine wave from the index of the highest amplitude can be slightly off. To correct that, the actual frequency of a peak of DFT amplitude can be estimated as its mean frequency with respect to power density. See "Why are frequency values rounded in signal using FFT?"
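One way to read "mean frequency with respect to power density" is as a power-weighted average of the bin frequencies around the peak. The sketch below follows that reading; the function name refined_peak_hz, the half_width parameter, the Hann window, and the test signal are all assumptions made for illustration, not taken from the linked answer.

```python
import numpy as np

def refined_peak_hz(samples, sample_rate, half_width=3):
    """Power-weighted mean frequency of the bins around the largest peak."""
    windowed = samples * np.hanning(len(samples))   # limit leakage first
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    peak = int(np.argmax(power))
    lo = max(peak - half_width, 0)
    hi = min(peak + half_width + 1, len(power))
    return np.sum(freqs[lo:hi] * power[lo:hi]) / np.sum(power[lo:hi])

sample_rate = 8000           # Hz (assumption)
n = 4096                     # samples (assumption)
t = np.arange(n) / sample_rate
true_hz = 501.0              # deliberately between two DFT bins
wave = np.sin(2 * np.pi * true_hz * t)

# Estimate from the index of the highest bin alone
freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
raw_hz = freqs[np.argmax(np.abs(np.fft.rfft(wave)))]
# Estimate from the power-weighted mean around the peak
refined_hz = refined_peak_hz(wave, sample_rate)
print(raw_hz, refined_hz)
```

The raw estimate can only take values spaced sample_rate / n apart, while the weighted mean can land between bins.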

Upvotes: 1
