Reputation: 2624
I am trying to understand what the outputs of scipy.signal.spectrogram() are, and how to use them. Currently, I read a .wav file and generate a spectrogram.
from scipy.io import wavfile as wav
from scipy import signal
sample_rate, data = wav.read('sound.wav')
f, t, Sxx = signal.spectrogram(data, sample_rate)
In case I am understanding this completely wrong, my idea of a spectrogram is a 3D graph consisting of:
x-axis: time
y-axis: frequency
pixel colour/brightness: amplitude
So I'm wondering how f, t and Sxx relate to the time, frequency, and amplitude.
Thanks for reading, any help is appreciated!
Upvotes: 2
Views: 1194
Reputation: 1026
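f is the frequency array. It contains the centre frequency (in Hz, when the sample rate is given in Hz) of every FFT band, from 0 up to half the sample rate, and can be used as the axis labels for a graph.
t is the time array. It contains the time (in seconds, relative to the start of the source signal) at which each FFT segment is centred. Again, it can be used for axis labels.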
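The Sxx array holds the values to plot. With the default arguments these are power spectral density values rather than raw amplitudes. For a one-dimensional (mono) signal it is a 2D array whose shape is the length of f by the length of t.
Therefore the first axis, which matches the length of f, is the frequency axis, and the last axis, which matches the length of t, is the time axis.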
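To check the axis layout described above, here is a minimal sketch. It swaps the .wav file for a synthetic 1 kHz tone sampled at 8000 Hz (both values are assumptions chosen for illustration) and relies on the default arguments of signal.spectrogram.

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for the .wav data: 1 second of a 1 kHz tone
# (sample rate and tone frequency are assumed values, not from the question).
sample_rate = 8000
n = np.arange(sample_rate)
data = np.sin(2 * np.pi * 1000 * n / sample_rate)

f, t, Sxx = signal.spectrogram(data, sample_rate)

# Rows of Sxx follow f (frequency bins), columns follow t (time segments).
print(Sxx.shape, len(f), len(t))

# f runs from 0 Hz up to the Nyquist frequency (sample_rate / 2).
print(f[0], f[-1])

# Averaging over time (axis 1) leaves one value per frequency bin;
# the strongest bin should sit at, or right next to, the tone frequency.
peak_freq = f[Sxx.mean(axis=1).argmax()]
print(peak_freq)
```

So Sxx[i, j] is the value for frequency f[i] at time t[j].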
SciPy does not normalise Sxx for you, so if you want to normalise it for display you will need to find the min and max values of the Sxx array yourself.
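One possible way to do that normalisation is sketched below. It converts the power values to decibels first, which is a common choice for display but not something the original answer prescribes, and then rescales to the range 0 to 1. The test signal, the noise level and the small eps offset are all assumptions made for the example.

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for the .wav data: a 1 kHz tone plus a little noise
# (all values here are assumptions for illustration).
sample_rate = 8000
rng = np.random.default_rng(0)
time = np.arange(sample_rate) / sample_rate
data = np.sin(2 * np.pi * 1000 * time) + 0.01 * rng.standard_normal(sample_rate)

f, t, Sxx = signal.spectrogram(data, sample_rate)

# Power values span many orders of magnitude, so convert to decibels.
# eps guards against log10(0) if a bin happens to be exactly zero.
eps = 1e-12
Sxx_db = 10 * np.log10(Sxx + eps)

# Min-max normalisation: the weakest bin maps to 0 and the strongest to 1.
lo, hi = Sxx_db.min(), Sxx_db.max()
Sxx_norm = (Sxx_db - lo) / (hi - lo)
```

The result has the same shape as Sxx, so it can be drawn with f and t as axes, for example with matplotlib's pcolormesh(t, f, Sxx_norm).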
Upvotes: 2