Reputation: 4792
Sorry if I submit a duplicate, but I wonder if there is any lib in python which makes you able to extract sound spectrum from audio files. I want to be able to take an audio file and write an algoritm which will return a set of data {TimeStampInFile; Frequency-Amplitude}.
I heard that this is usually called Beat Detection, but as far as I see beat detection is not a precise method, it is good only for visualisation, while I want to manipulate on the extracted data and then convert it back to an audio file. I don't need to do this real-time.
I will appreciate any suggestions and recommendations.
Upvotes: 10
Views: 32362
Reputation: 779
You can compute and visualize the spectrum and the spectrogram this using scipy, for this test i used this audio file: vignesh.wav
from scipy.io import wavfile # scipy library to read wav files
import numpy as np
AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName)
# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)
# spectrum
from scipy.fftpack import fft # fourier transform
n = len(Audiodata)
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] #Half of the spectrum
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# power spectrum
MagFreq = MagFreq**2
if n % 2 > 0: # ffte odd
MagFreq[1:len(MagFreq)] = MagFreq[1:len(MagFreq)] * 2
else:# fft even
MagFreq[1:len(MagFreq) -1] = MagFreq[1:len(MagFreq) - 1] * 2
plt.figure()
freqAxis = np.arange(0,int(np.ceil((n+1)/2.0)), 1.0) * (fs / n);
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) #Power spectrum
plt.xlabel('Frequency (kHz)'); plt.ylabel('Power spectrum (dB)');
#Spectrogram
from scipy import signal
N = 512 #Number of point in the fft
f, t, Sxx = signal.spectrogram(Audiodata, fs,window = signal.blackman(N),nfft=N)
plt.figure()
plt.pcolormesh(t, f,10*np.log10(Sxx)) # dB spectrogram
#plt.pcolormesh(t, f,Sxx) # Lineal spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [seg]')
plt.title('Spectrogram with scipy.signal',size=16);
plt.show()
i tested all the code and it works, you need, numpy, matplotlib and scipy.
cheers
Upvotes: 15
Reputation: 23510
I think your question has three separate parts:
1. How to load audio files in python?
You are probably best off by using scipy
, as it provides a lot of signal processing functions. For loading audio files:
import scipy.io.wavfile
samplerate, data = scipy.io.wavfile.read("mywav.wav")
Now you have the sample rate (samples/s) in samplerate
and data as a numpy.array
in data
. You may want to transform the data into floating point, depending on your application.
There is also a standard python module wave
for loading wav-files, but numpy
/scipy
offers a simpler interface and more options for signal processing.
2. How to calculate the spectrum
Brief answer: Use FFT. For more words of wisdom, see:
Analyze audio using Fast Fourier Transform
Longer answer is quite long. Windowing is very important, otherwise you'll have strange spectra.
3. What to do with the spectrum
This is a bit more difficult. Filtering is often performed in time domain for longer signals. Maybe if you tell us what you want to accomplish, you'll receive a good answer for this one. Calculating the frequency spectrum is one thing, getting meaningful results with it in signal processing is a bit more complicated.
(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge on audio signal processing, in which case this is irrelevant.)
Upvotes: 11